DOCUMENT RESUME

ED 121 819                                        TM 005 250

AUTHOR        McBride, James R.; Weiss, David J.
TITLE         Some Properties of a Bayesian Adaptive Ability
              Testing Strategy.
INSTITUTION   Minnesota Univ., Minneapolis. Dept. of Psychology.
SPONS AGENCY  Office of Naval Research, Washington, D.C. Personnel
              and Training Research Programs Office.
REPORT NO     RR-76-1
PUB DATE      Mar 76
NOTE          [illegible]p.
EDRS PRICE    MF-$0.83 HC-$2.06 Plus Postage
DESCRIPTORS   *Ability; Bayesian Statistics; Branching; *Computer
              Oriented Programs; Correlation; Guessing (Tests);
              Item Banks; Scores; *Sequential Approach;
              *Simulation; Test Bias; *Testing
IDENTIFIERS   *Bayesian Adaptive Ability Testing; Tailored
              Testing



ABSTRACT

Four monte carlo simulation studies of Owen's
Bayesian sequential procedure for adaptive mental testing were
conducted. Whereas previous simulation studies of this procedure have
concentrated on evaluating it in terms of the correlation of its test
scores with simulated ability in a normal population, these four
studies explored a number of additional properties, both in a
normally distributed population and in a distribution-free context.
Study 1 replicated previous studies with finite item pools, but
examined such properties as the bias of estimate, mean absolute
error, and correlation of test length with ability. Studies 2 and 3
examined the same variables in a number of hypothetical infinite item
pools, investigating the effects of item discriminating power,
guessing, and variable vs. fixed test length. Study 4 investigated
some properties of the Bayesian test scores as latent trait
estimators, under three different configurations (regressions of item
discrimination on item difficulty) of item pools. The properties of
interest included the regression of latent trait estimates on actual
trait levels, the conditional bias of such estimates, the information
curve of the trait estimates, and the relationship of test length to
ability level. The results of these studies indicated that the
ability estimates derived from the Bayesian test strategy were highly
correlated with ability level. However, the ability estimates were
also highly correlated with number of items administered, were
nonlinearly biased, and provided measurements which were not of equal
precision at all levels of ability. (Author)






SOME PROPERTIES OF A BAYESIAN 
ADAPTIVE ABILITY TESTING STRATEGY 

James R. McBride 

AND 

David J. Weiss 






Psychometric Methods Program 
Department of Psychology 
University of Minnesota 
Minneapolis, MN 55455



Research Report 76-1 
March 1976 



Prepared under contract No. N00014-76-C-0243, NR150-382

with the

Personnel and Training Research Programs
Psychological Sciences Division
Office of Naval Research

Approved for public release; distribution unlimited.
Reproduction in whole or in part is permitted for
any purpose of the United States Government.



REPORT DOCUMENTATION PAGE

Security classification: Unclassified
Report number: Research Report 76-1
Title: Some Properties of a Bayesian Adaptive Ability Testing Strategy
Type of report: Technical Report
Authors: James R. McBride and David J. Weiss
Contract or grant number: N00014-76-C-0243
Performing organization: Department of Psychology, University of
    Minnesota, Minneapolis, Minnesota 55455
Program element, project, task area, and work unit numbers:
    P.E.: 61153N   PROJ.: RR042-04
    T.A.: RR042-04-01
    W.U.: NR150-382
Controlling office: Personnel and Training Research Programs, Office of
    Naval Research, Arlington, Virginia 22217
Report date: March 1976
Distribution statement: Approved for public release; distribution
    unlimited. Reproduction in whole or in part is permitted for any
    purpose of the United States Government.
Supplementary notes: Portions of this paper were presented at the Spring
    1975 meeting of the Psychometric Society, Iowa City, Iowa, April 24,
    1975, and the Conference on Computerized Adaptive Testing,
    Washington, D.C., June 12, 1975.
Key words: testing; ability testing; computerized testing; adaptive
    testing; sequential testing; branched testing; individualized
    testing; tailored testing; programmed testing; response-contingent
    testing; automated testing





CONTENTS

Owen's Bayesian Sequential Adaptive Testing Strategy

Study 1: An "Ideal" Item Pool with Variable Test Length
    Background and Purpose
    Method
        Variables
        Item pools
        Response generation and test administration
        Evaluative criteria
    Results
    Conclusions

Study 2: Effects of Guessing and Item Discrimination in a Perfect Item Pool
    Background and Purpose
    Method
        Variables
        Item pools
        Response generation and test administration
        Analysis
    Results
        No-guessing condition
        Uncorrected-guessing condition
        Corrected-guessing condition
    Conclusions

Study 3: Effects of Fixed Test Length
    Background and Purpose
    Method
        Variables
        Item pools
        Response generation and test administration
        Analysis
    Results
    Conclusions

Study 4: Effects of Ability Level and Item Pool Configuration
    Background and Purpose
    Method
        Variables
        Item pools
        Response generation and test administration
        Analysis
    Results
        Regression of $\hat{\theta}$ on $\theta$
        Bias
        Information

General Summary and Conclusions
    Implications

References

APPENDIX: Supplementary Tables



Some Properties of a Bayesian 
Adaptive Ability Testing Strategy 



Adaptive or tailored ability testing subsumes a number of different
strategies for adapting the difficulty of test items to the examinee's
ability level. All the adaptive testing strategies have as one objective
the improvement of the psychometric properties of mental test scores
throughout the range of the trait of interest (e.g., ability). This is
accomplished by adapting test item difficulty to each examinee's ability,
during the test itself. Ideally the adaptive selection and administration
of test items would result in each examinee answering only those items
which are most informative for his own ability level. Additionally, where
items can be answered correctly by random guessing (e.g., multiple-choice
items), an optimally efficient adaptive item selection technique would
have the effect of equalizing the effect of guessing on test scores
throughout the ability range.

The different item selection techniques of the various adaptive
testing strategies have been described by Weiss (1974). One of the most
elegant of the adaptive strategies is a Bayesian sequential technique
proposed by Owen (1969, 1975) and studied empirically by several
investigators including Wood (1971), Urry (1971) and Jensema (1972).

Owen's Bayesian Sequential Adaptive Testing Strategy

Owen's technique is a general one for the sequential design and
analysis of independent experiments with a dichotomous response. Its
application in mental testing is to the problem of estimating ability by
means of sequential selection, administration, and scoring of dichotomous
test items. The mathematical details of the method arise from latent trait
theory, with the item characteristic curves all assumed to take the form
of the normal ogive. The properties of the normal ogive item characteristic
function and its logistic approximation have been described by Lord &
Novick (1968) and Birnbaum (1968), respectively.

Owen's procedure involves the individually tailored sequential design
of a test by appropriate choice of available item parameters,¹ and
estimation of ability ($\theta$) via a Bayesian-motivated approximation.
At each step $m$ in the ability estimation sequence a normal prior
distribution on $\theta$ is assumed, with parameters $\mu_m$ and
$\sigma_m^2$, where $m$ indicates the number of items already administered
in the sequence. A test item to be administered at step $m+1$ is selected
so as to minimize a quadratic loss function on $\theta$. With no guessing
(i.e., $c_g = 0$) and the discrimination parameters $a_g$ constant over
items, the appropriate item is the available one which minimizes the
absolute value of the difference $(b_g - \mu_m)$. With $c_g > 0$ the
optimal difference is somewhat negative; that is, optimal difficulty is
somewhat "easier" than the examinee's ability.

¹Each item $g$ can be characterized by three parameters, $a_g$, $b_g$,
and $c_g$, which are, respectively, the item discriminating power, item
difficulty, and item guessing parameter. The guessing parameter, $c_g$,
is simply the probability of answering the item correctly by chance alone.

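Following item administration at step $m+1$, the parameters $\mu_m$ and
$\sigma_m^2$ of the prior distribution are updated in accord with the
examinee's performance on the item. In the case of a correct answer:

$$\mu_{m+1} = E(\theta \mid 1) = \mu_m + (1 - c_g)\,\frac{\sigma_m^2}{\sqrt{a_g^{-2} + \sigma_m^2}}\,\frac{\phi(D)}{A} \qquad [1]$$

and

$$\sigma_{m+1}^2 = \mathrm{var}(\theta \mid 1) = \sigma_m^2 \left\{ 1 - \frac{(1 - c_g)\,\phi(D)}{\left(1 + a_g^{-2}\sigma_m^{-2}\right) A} \left[ \frac{(1 - c_g)\,\phi(D)}{A} - D \right] \right\} \qquad [2]$$

Following a wrong answer:

$$\mu_{m+1} = E(\theta \mid 0) = \mu_m - \frac{\sigma_m^2}{\sqrt{a_g^{-2} + \sigma_m^2}}\,\frac{\phi(D)}{\Phi(D)} \qquad [3]$$

and

$$\sigma_{m+1}^2 = \mathrm{var}(\theta \mid 0) = \sigma_m^2 \left\{ 1 - \frac{\phi(D)}{\left(1 + a_g^{-2}\sigma_m^{-2}\right) \Phi(D)} \left[ \frac{\phi(D)}{\Phi(D)} + D \right] \right\} \qquad [4]$$

In Equations 1 through 4 (taken from Owen, 1975),

$\phi(D)$ is the normal probability density function,

$\Phi(D)$ is the cumulative normal distribution function,

$$D = \frac{b_g - \mu_m}{\sqrt{a_g^{-2} + \sigma_m^2}} \qquad [5]$$

and

$$A = c_g + (1 - c_g)\,\Phi(-D). \qquad [6]$$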
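The update step in Equations 1 through 6 can be sketched in a few lines of code. The sketch below uses the form of Owen's (1975) normal-ogive update as it is usually stated; the function and argument names (`owen_update`, `mu`, `var`, `correct`) are illustrative assumptions, not names taken from the original simulation program.

```python
import math


def phi(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def owen_update(mu, var, a, b, c, correct):
    """Return (posterior mean, posterior variance) after one scored item.

    mu, var : mean and variance of the current normal prior on ability
    a, b, c : discrimination, difficulty, and guessing parameter of the item
    correct : True for a correct answer, False for a wrong answer
    """
    s = math.sqrt(a ** -2 + var)            # sqrt(a_g^-2 + sigma_m^2)
    D = (b - mu) / s                        # Equation 5
    shrink = 1.0 / (1.0 + a ** -2 / var)    # 1 / (1 + a_g^-2 sigma_m^-2)
    if correct:
        A = c + (1.0 - c) * Phi(-D)         # Equation 6
        h = (1.0 - c) * phi(D) / A
        new_mu = mu + (var / s) * h                     # Equation 1
        new_var = var * (1.0 - shrink * h * (h - D))    # Equation 2
    else:
        h = phi(D) / Phi(D)
        new_mu = mu - (var / s) * h                     # Equation 3
        new_var = var * (1.0 - shrink * h * (h + D))    # Equation 4
    return new_mu, new_var
```

Calling `owen_update(0.0, 1.0, a=1.0, b=0.0, c=0.20, correct=True)` moves the estimate upward by less than the same call with `c=0.0`, which reflects the discounting of correct answers that may be due to guessing.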



The parameters $\mu_{m+1}$ and $\sigma_{m+1}^2$ of the Bayes posterior
distribution on $\theta$ are used as the parameters of the next step's
prior. At each step the prior distribution is assumed to be normal.
Testing may be terminated when $\sigma_m^2$ becomes arbitrarily small or
when $m$ becomes arbitrarily large, or when some other criterion has been
reached. At termination the latest $\mu_m$ is the estimator of $\theta$,
and $\sigma_m^2$ is a measure of the uncertainty of the estimate.
Urry (1971) and Jensema (1972, 1974) have interpreted $\sigma_m^2$ as the
squared standard error of estimate (S.E.E.) of $\hat{\theta}$. Owen (1975)
gives a theorem showing that as $m \to \infty$, $\mu_m \to \theta$; that
is, the posterior mean is a consistent estimator of an examinee's ability.

Practically speaking, of course, the number of items administered will
never approach infinity; but if the pool of available items is sufficiently
large and appropriately constituted, $\sigma_m^2$ will diminish rapidly,
permitting valid estimation of $\theta$ using a small number of items.
Urry (1971, 1974) has specified the requirements for a satisfactory item
pool for implementing Owen's testing procedure and has shown in computer
simulation studies that Owen's sequential test can achieve in 3 to 30
items the validity of a much longer conventional test, with the number of
items needed diminishing as item discriminatory power increased.

Urry's (1971, 1974) and Jensema's (1972, 1974) monte carlo simulation
studies of Owen's Bayesian testing strategy have evaluated its merit solely
in terms of the "fidelity" (or "validity")² of the resulting ability
estimates and the mean number of items required to achieve any specified
value of the fidelity coefficient. Although the fidelity coefficient is of
great interest, Lord (1970, p. 152) has pointed out that evaluating an
adaptive test by means of a group statistic such as the correlation
coefficient presumes some knowledge of the group's distribution on the
trait being measured, and ignores information relevant to the accuracy or
goodness of the ability estimates at any given level of the trait.

The correlation of test scores with the simulated underlying ability is
only one criterion by which to evaluate a proposed adaptive testing strategy.
Since the Bayesian sequential test scores are actually estimates of underlying
trait level, in the same metric, the accuracy of the estimates is also of
interest. "Accuracy" refers to the closeness of the estimates to actual
ability; it may vary systematically with ability level. Another interesting
property of estimates is bias, or error of central tendency. Two kinds of
bias should be of some concern: 1) unconditional bias, or group mean error
of estimate; and 2) conditional bias, or mean error of estimate at a given
level of the parameter being estimated.



²By "validity" here is meant the correlation of the ability estimates with
actual ability. Green (1975) suggested use of the term "fidelity" in this
context to denote validity coefficients obtained from monte carlo simulation
studies. Green's convention will be followed here.



Purpose 

The purpose of the present paper is to report the results of a
series of simulation studies designed to investigate the influence of
guessing and item pool characteristics on the bias, accuracy, and other
properties of the trait estimates derived from Owen's Bayesian sequential
testing strategy.

The studies reported below were motivated by results obtained with
live testing of Owen's strategy. Using Owen's testing strategy with 603
college students and a 329-item pool of vocabulary knowledge test items, a
correlation of .84 was obtained between estimated ability level and number
of test items to termination. Simulation studies then were designed to
investigate the influence of item pool characteristics on that unexpectedly
large correlation.

The simulation studies reported here were intended to explore both
the properties of the Bayesian sequential testing method itself and
properties of the resulting ability estimates. The former properties are
investigated best by sampling from "populations" of simulated examinees
whose distribution on the ability dimension approximates in form and
parameters (mean, variance) the population assumed by the testing
procedure: here, a normal population with mean 0 and variance 1. The first
three studies reported sampled examinees from such a population. These
studies were designed to investigate the effects of guessing, of item
discriminating power, and of two different test termination criteria on
certain group statistics. The independent and dependent variables of
interest in each study are described separately below.

The fourth study focused on certain properties of the test scores
as estimators of the ability underlying the item responses under varying
conditions. This area of inquiry required sampling large numbers of
examinees at regular intervals throughout the normal range of the trait.
The details of this study are likewise described separately below.



Study 1: An Ideal Item Pool with Variable Test Length 

Background and Purpose 

Jensema (1972) simulated Bayesian test administration to examinees
sampled from a normal [0,1] distribution using two different "ideal"
100-item pools. These pools were "ideal" according to Jensema's prescription
that items for use in this testing strategy should have high discriminations
and should be rectangularly distributed in their difficulties. The first
pool had four items available at each of twenty-five equally spaced
difficulty levels in the interval $-2.4 \le b \le 2.4$; all items had
guessing parameters of $c = .20$ and discriminations of $a = .80$. A second
item pool was identical to the first except for the value of the constant
discrimination parameter, which was $a = 1.60$. The Bayesian test was
simulated as proposed by Owen (1969), with the parameters of the initial
ability distribution set at [0,1] for each examinee. Testing terminated for
each examinee whenever the posterior variance of the ability estimate
diminished below a predetermined value or after thirty items, whichever
occurred first. Jensema set the critical posterior variance value at .0625,
which corresponds to a standard error of estimate of .25, and hence to a
fidelity coefficient exceeding .968 (Jensema, 1972, p. 114). Jensema's
obtained fidelity coefficients and mean test lengths, obtained from
simulations using random samples of 100 examinees, are listed in Table 1.



Table 1

Mean Test Lengths and Obtained Fidelity Coefficients for
Two Simulated Bayesian Sequential Tests,
Distinguished by Their Item Discriminating Power (a)
(from Jensema, 1972)

                   Mean           Fidelity
     a          Test Length      Coefficient

    .80            30*              .93
   1.60            17.5             .97

*No tests achieved the posterior variance termination
criterion in this condition.
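The correspondence among the .0625 posterior variance criterion, the .25 standard error of estimate, and the .968 fidelity bound cited from Jensema can be checked with a short calculation. The second line assumes that the bound is obtained as the square root of one minus the ratio of posterior variance to prior variance, with the prior variance equal to 1; the derivation itself is not given in the text above.

```latex
\begin{align*}
\text{S.E.E.} &= \sqrt{\sigma_m^2} = \sqrt{.0625} = .25 \\
r_{\theta\hat{\theta}} &\ge \sqrt{1 - \frac{\sigma_m^2}{\sigma_0^2}}
  = \sqrt{1 - \frac{.0625}{1}} = \sqrt{.9375} \approx .968
\end{align*}
```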



Jensema (1972) did not report, however, some properties of the Bayesian
sequential testing procedures which are of practical interest. The
purpose of the present study was to replicate Jensema's research with
these same two "ideal" item pools, while studying some other properties
of the ability estimates in addition to fidelity and mean test length.

Method 

Variables. Dependent variables were the individual ability estimates
($\hat{\theta}$) and the number of items ($k$) required to satisfy the
posterior variance termination criterion of $\sigma_m^2 < .0625$.
Independent variables were the simulated examinees' abilities ($\theta$)
and the discriminating power ($a = .80$ or 1.60) of the items in the
simulated item pool.

Examinees' abilities were simulated by computer-generation of 100
random numbers ($\theta_i$) from a normal population with mean 0 and
variance 1. The same 100 "examinees" were tested with both item pools.

Item pools. Two 100-item "ideal" item pools were simulated,
corresponding to the ones used by Jensema (1972). In each pool there were
four items at each of twenty-five difficulty levels ($b$) equally spaced
in the interval [$-2.4 \le b \le +2.4$]. The guessing parameter ($c$) was
constant across items; for both pools, $c = .20$. The item pool for the
first test had a constant discrimination parameter of $a = .80$ across
items; the second pool employed a constant item discrimination parameter
equal to $a = 1.60$.

Thus, for each test administration an item pool containing 100 distinct
items was simulated; each item $g$ could be characterized by its parameters
$a_g$, $b_g$, $c_g$.
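The two item pools just described are simple enough to build directly. The following is a minimal sketch, assuming a list-of-dictionaries representation; the function name `make_ideal_pool` and its keyword arguments are illustrative only.

```python
def make_ideal_pool(a, c=0.20, levels=25, per_level=4, lo=-2.4, hi=2.4):
    """Build a 100-item pool: `per_level` items at each of `levels`
    equally spaced difficulties between `lo` and `hi`."""
    step = (hi - lo) / (levels - 1)          # 0.2 for the values above
    pool = []
    for j in range(levels):
        b = round(lo + j * step, 10)         # rounding removes float noise
        for _ in range(per_level):
            pool.append({"a": a, "b": b, "c": c})
    return pool


pool_low = make_ideal_pool(a=0.80)    # first pool, a = .80
pool_high = make_ideal_pool(a=1.60)   # second pool, a = 1.60
```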

Response generation and test administration. Item responses were
simulated by calculating, for each item-examinee administration, the
probability of a correct response to the item given the simulated ability
($\theta_i$) and the item parameters $a_g$, $b_g$, $c_g$, using equations
presented by Betz & Weiss (1974) and Vale & Weiss (1975). This probability
$P_g(\theta_i)$ was compared with a random number $r_{gi}$ generated from a
uniform distribution in the interval [0,1]. A score of 1 ("correct") for
examinee $i$ on item $g$ was assigned if $P_g(\theta_i) \ge r_{gi}$;
otherwise a score of 0 was assigned.
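The response-generation rule can be sketched as follows. The item characteristic function used here is the three-parameter normal ogive; that choice is an assumption, since the cited equations of Betz & Weiss (1974) and Vale & Weiss (1975) are not reproduced in this report, and the function names are hypothetical.

```python
import math
import random


def p_correct(theta, a, b, c):
    """Assumed three-parameter normal-ogive probability of a correct answer."""
    ogive = 0.5 * (1.0 + math.erf(a * (theta - b) / math.sqrt(2.0)))
    return c + (1.0 - c) * ogive


def simulate_response(theta, a, b, c, rng):
    """Score 1 if the response probability is at least as large as a
    uniform [0, 1) random number, otherwise score 0."""
    return 1 if p_correct(theta, a, b, c) >= rng.random() else 0


rng = random.Random(1976)   # fixed seed so a run can be repeated
score = simulate_response(theta=0.0, a=1.60, b=0.2, c=0.20, rng=rng)
```

Passing the random number generator in explicitly makes it straightforward to test the same simulated examinees under different item pools, as was done in this study.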

Test administration was simulated exactly as proposed by Owen
(1969). For each examinee an initial ability $\hat{\theta}_0 = 0$ was
assumed, and the prior distribution was assumed to be normal [0,1]. The
optimal item in the pool was selected based on the item parameters, and
its administration to the examinee was simulated. Based on the item score
(1 or 0), the parameters ($\mu_m$, $\sigma_m^2$) were updated, and another
item was selected and administered. This recursive procedure was repeated
until 30 items had been taken by the "examinee", or until $\sigma_m^2$ was
smaller than .0625, whichever occurred first. Once any particular item had
been taken by the examinee it was not reused. At test termination, the
examinee's simulated ability ($\theta_i$), the Bayesian estimate
($\hat{\theta}_i$), and the number of items taken ($k$) were recorded.
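The select, administer, update, and terminate cycle can be illustrated with a deliberately simplified sketch. It covers only the no-guessing case ($c = 0$), where the selection rule stated earlier reduces to choosing the unused item whose difficulty is closest to the current estimate; Study 1 itself used $c = .20$, for which the selection rule is more involved. All function names are assumptions made for this illustration.

```python
import math
import random


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_no_guessing(mu, var, a, b, correct):
    """Posterior mean and variance for c = 0 (special case of Equations 1-4)."""
    s = math.sqrt(a ** -2 + var)
    D = (b - mu) / s
    sign = 1.0 if correct else -1.0
    h = phi(D) / (Phi(-D) if correct else Phi(D))
    new_mu = mu + sign * (var / s) * h
    new_var = var * (1.0 - (var / s ** 2) * h * (h - sign * D))
    return new_mu, new_var


def administer(theta, difficulties, a, rng, max_items=30, crit=0.0625):
    """Simulate one adaptive test; return (estimate, posterior variance, k)."""
    mu, var, k = 0.0, 1.0, 0                 # normal [0,1] prior
    available = list(difficulties)
    while available and k < max_items and var >= crit:
        b = min(available, key=lambda d: abs(d - mu))   # closest unused item
        available.remove(b)                             # items are not reused
        correct = Phi(a * (theta - b)) >= rng.random()  # simulated response
        mu, var = update_no_guessing(mu, var, a, b, correct)
        k += 1
    return mu, var, k


difficulties = [round(-2.4 + 0.2 * j, 1) for j in range(25) for _ in range(4)]
estimate, post_var, n_items = administer(0.5, difficulties, 1.60, random.Random(7))
```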

Evaluative criteria. For each of the two test administrations,
after all 100 examinees' tests were simulated, the following properties
of the sequential test were estimated from the data:

a. the bias, or mean algebraic error of the ability estimates,
   $\frac{1}{N}\sum_{i=1}^{N}(\hat{\theta}_i - \theta_i)$;

b. the accuracy, or mean absolute error of the estimates,
   $\frac{1}{N}\sum_{i=1}^{N}\lvert\hat{\theta}_i - \theta_i\rvert$;

c. $r_{\theta k}$, the correlation of test length with ability;
   $r_{\hat{\theta} k}$, the correlation of test length with estimated
   ability;

d. $r_{e\theta}$, the correlation of the algebraic errors of estimate
   ($\hat{\theta}_i - \theta_i$) with ability;
   $r_{e\hat{\theta}}$, the correlation of ($\hat{\theta}_i - \theta_i$)
   with estimated ability;

e. $r_{\theta\hat{\theta}}$, the fidelity coefficient;

f. the mean, minimum and maximum test length required to achieve
   the posterior variance termination criterion.
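Criteria a through f are all simple summaries of three recorded lists: true abilities, ability estimates, and test lengths. A minimal sketch is given below; the function names and dictionary keys are illustrative assumptions. A correlation is returned as NaN when one variable has no variance, which is the situation flagged in the results for the $a = .80$ pool.

```python
import math


def pearson(x, y):
    """Product-moment correlation; NaN if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def evaluate(theta, theta_hat, k):
    """Compute the evaluative criteria for one simulated test administration."""
    errors = [est - true for est, true in zip(theta_hat, theta)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,                          # criterion a
        "mean_abs_error": sum(abs(e) for e in errors) / n,  # criterion b
        "r_theta_k": pearson(theta, k),                   # criterion c
        "r_thetahat_k": pearson(theta_hat, k),
        "r_error_theta": pearson(errors, theta),          # criterion d
        "r_error_thetahat": pearson(errors, theta_hat),
        "fidelity": pearson(theta, theta_hat),            # criterion e
        "mean_k": sum(k) / n,                             # criterion f
        "min_k": min(k),
        "max_k": max(k),
    }
```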

Results 

Table 2 contains the results from Study 1. As Table 2 shows, there
was positive bias (.06 and .05) in the group scores for both tests,
indicating that ability was overestimated, on the average. Mean absolute
error was .26 for the $a = .80$ item pool and .19 for the more
discriminating item pool; in these data, then, the more discriminating
item pool estimated ability with smaller average error.



Table 2

Properties of the Bayesian Sequential Test for Two Values of Item
Discrimination, with Corrected Guessing and Ideal Item Pool

                              Item Discrimination (a)
Property                          .80        1.60

Test Length
    Mean                          30*        18
    Minimum                       30         12
    Maximum                       30         30
Errors of Estimate
    Mean (Bias)                   .06        .05
    Mean Absolute Error           .26        .19
Correlates
    $r_{e\theta}$                -.35       -.40
    $r_{e\hat{\theta}}$          -.07       -.21
    $r_{\theta k}$                **         .84
    $r_{\hat{\theta} k}$          **         .85
    $r_{\theta\hat{\theta}}$      .96        .98

*An arbitrary maximum test length of 30 items was imposed.
**There was no variance on test length in the $a = .80$ test.
However, $\theta$ and $\hat{\theta}$ correlated .81 and .84 with
posterior variance.



Mean test length for the $a = .80$ item pool was 30 items, with no
variance, indicating that the posterior variance termination criterion
never was reached using this item pool. The higher discriminating pool
($a = 1.60$) required a mean test length of 18 items, with a range of from
12 to 30. For this item pool test length correlated .84 and .85 with ability
and the ability estimator, respectively. This strong positive correlation
was essentially the same as was found in the live-testing results. It
indicates that despite the "ideal" construction of the item pool, the test
required substantially larger numbers of items to achieve the termination
criterion as ability increased. (Since there was no variance in test
length for the $a = .80$ item pool, the test length correlations cannot be
evaluated under that item pool configuration.)

Errors of estimate ($\hat{\theta}_i - \theta_i$) correlated -.35 and -.40
with ability for the two item pools, which could indicate a tendency to
underestimate ability at high levels and to overestimate it at low levels.
This, of course, is a phenomenon typical of regression estimates; the
Bayesian test scores seem to be acting like regression estimates in this
regard. This same tendency was evident to a smaller extent in the
correlations between errors and ability estimates ($r_{e\hat{\theta}}$).

The fidelity coefficients ($r_{\theta\hat{\theta}}$) were .96 and .98,
respectively, for the $a = .80$ and $a = 1.60$ item pools. These were
slightly higher than those obtained by Jensema (see Table 1). The
differences are likely due to random fluctuations resulting from the
relatively small sample size of 100 simulated testees (see Betz & Weiss,
1974, pp. 20-21 and 24-25).

Conclusions 

The replication of Jensema's study of the Bayesian sequential
test using these two item pools corroborated his findings with regard to
fidelity and mean test length. The fidelity coefficients obtained in the
present study were slightly higher than his, while mean test lengths
were almost identical. It seems clear that Owen's adaptive testing procedure
has the potential of achieving measurement of high fidelity with relatively
short tests. However, the strong correlation between ability and test
length suggests a potential problem if the Bayesian test is used in a
group of higher ability than is assumed beforehand. Additionally, the
overall positive bias of the trait estimates suggests that additional
study of the testing procedure is required before its scores are used
directly as estimators of ability. However, the generality of the results
of Study 1 is limited to "ideal" item pools with rectangular distributions
of the difficulty parameters and with the same discrimination and guessing
parameters as in the present study.



Study 2: Effects of Guessing and Item Discrimination 
in a Perfect Item Pool 

Background and Purpose

The discovery in Study 1 of positive bias in the Bayesian trait
estimates, and of a strong positive correlation between ability and test
length in the $a = 1.60$ item pool, raises the question of the
generalizability of these phenomena. These results might be due to sampling
fluctuations, to the specific item parameters employed, to the effects of
random guessing, or to characteristics inherent in Owen's sequential
testing procedure. Study 2 was designed to test the generality of the
results of Study 1.

In Study 2 many sequential tests were simulated by varying the
discriminating power of the item pool and the effect of guessing.
Further, in order to avoid loss of generality due to a specific range
of the distribution of item difficulty values in the item pool, Study 2
simulated a "perfect" item pool, one behaving as though it contained an
unlimited number of items at any specifiable difficulty level. The results
of Study 2, therefore, should reflect the best attainable results under the
Bayesian procedure, given the guessing and discrimination parameters of
the items.

To evaluate the effects of guessing on testing strategy characteristics,
test administration was simulated under the three different guessing
conditions described below: no guessing, uncorrected guessing, and corrected
guessing. Under each of these conditions fourteen "perfect" item pools
were simulated. These differed from one another only in their item
discriminating powers. Thus, fourteen values of $a$ were used; $a$ was
constant within any test simulation, but varied across tests. The same
properties of the test procedure studied in Study 1 were of interest in
Study 2.

Method 

Variables. Dependent variables in Study 2 were the same as in Study
1: ability estimates ($\hat{\theta}$) and test length ($k$). Independent
variables were simulated ability ($\theta$), discriminating power of the
item pool, and the effect of guessing and of scoring for guessing.

To study the effect of guessing, three different conditions were
simulated:

1. No guessing; in the item response model, $c$ was set to 0,
   and was assumed to be zero in the Bayesian scoring formulae
   (Equations 1 through 4).

2. Uncorrected guessing; $c$ was set to .20 in the item response
   model, but was assumed to be zero in the Bayesian scoring
   formulae.

3. Corrected guessing; $c$ was set to .20 in both the item
   response model and the Bayesian scoring formulae.

Under each guessing condition, fourteen test administrations were
simulated. These differed only in the constant value of the item
discriminating powers in the respective item pools. The fourteen values
used were $a$ = .5, .6, .7, .8, .9, 1.0, 1.25, 1.50, 1.75, 2.00, 2.25,
2.50, 2.75, and 3.00. For each test administration, the same 100 simulated
ability values used in Study 1 constituted the examinee "group".
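The design just described crosses three guessing conditions with fourteen discrimination values. One way to lay it out is sketched below; the key point is that each condition carries two guessing values, one used to generate responses and one assumed by the scoring formulae. The names `GUESSING`, `A_VALUES`, `c_response`, and `c_scoring` are illustrative assumptions.

```python
from itertools import product

# Fourteen constant item discrimination values.
A_VALUES = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.25,
            1.50, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00]

# (c used in the item response model, c assumed in the scoring formulae)
GUESSING = {
    "no guessing": (0.0, 0.0),
    "uncorrected guessing": (0.20, 0.0),
    "corrected guessing": (0.20, 0.20),
}

conditions = [
    {"condition": name, "a": a, "c_response": c_resp, "c_scoring": c_score}
    for (name, (c_resp, c_score)), a in product(GUESSING.items(), A_VALUES)
]
```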






Item pools. The "perfect" item pools were simulated by calculating,
for each examinee after each item response was scored, the optimal
difficulty value of the next item, given $a_g$, $c_g$ and the current
ability estimate. This optimal item difficulty was determined using a
formula given by Birnbaum (1968, p. 464) for calculating the difficulty
level at which maximal item information occurs, given $a_g$ and $c_g$, and
assuming that $\hat{\theta}_i = \theta_i$. With $a_g$ constant and when no
guessing is assumed ($c_g = 0$ in the scoring formula), the optimal item is
one with $b_{m+1} = \hat{\theta}_m$. When guessing is assumed, the optimal
difficulty ($b_{m+1}$) is smaller than $\hat{\theta}_m$, by an amount
which is inversely proportional to $a_g$.

After the "optimal" item difficulty value was calculated, the
computer simulation program generated a hypothetical item with that
difficulty value, then "administered" it to the examinee. Thus, the
hypothetical item pool literally had available an unlimited number of
items of any difficulty value specified by the sequential testing
procedure.
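The "optimal difficulty" rule can be made concrete with a short sketch. It uses Birnbaum's maximum-information result for the three-parameter logistic model in the form in which it is usually quoted, with the scaling constant 1.7; whether the simulation program used exactly this form and constant is an assumption, and the function name is hypothetical.

```python
import math


def optimal_difficulty(theta_hat, a, c, scale=1.7):
    """Difficulty at which a logistic item with parameters (a, c) is most
    informative for an examinee whose ability equals theta_hat."""
    shift = math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / (scale * a)
    return theta_hat - shift


b_no_guess = optimal_difficulty(0.0, a=1.0, c=0.0)    # equals the estimate
b_guess = optimal_difficulty(0.0, a=1.0, c=0.20)      # somewhat easier item
```

With `c = 0` the shift vanishes, and doubling `a` halves the shift, which is the inverse proportionality to discrimination noted in the text.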

Response generation and test administration. Item responses were
simulated in the same manner described in Study 1. Test administration
was identical with Study 1, except for the item difficulty generation
procedure. The same posterior variance criterion ($\sigma^2_m = .0625$) was used as
a test termination rule. Unlike Study 1, test length was free to exceed
30 items; a maximum length of 100 items was imposed. At test termination,
ability ($\theta_i$), the ability estimate ($\hat{\theta}_i$), and the number of items
administered (k) were recorded for each examinee.
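
A minimal sketch of one simulated administration under these rules follows. It is not Owen's procedure: the posterior is carried on a numerical grid rather than by Owen's normal approximation, the response model is an assumed three-parameter logistic with D = 1.7, and the prior is assumed to be N(0, 1). All function names are hypothetical. The arguments `c_true` and `c_score` separate the guessing value used to generate responses from the one used in scoring, so (0, 0), (.20, 0), and (.20, .20) correspond to the no-guessing, uncorrected-guessing, and corrected-guessing conditions.

```python
import numpy as np

D = 1.7  # assumed logistic scaling constant


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))


def optimal_b(theta_hat, a, c):
    """Difficulty giving maximal information at theta_hat (Birnbaum's rule)."""
    return theta_hat - np.log((1.0 + np.sqrt(1.0 + 8.0 * c)) / 2.0) / (D * a)


def simulate_test(theta, a, c_true, c_score, rng, var_stop=0.0625, max_items=100):
    """Return (theta_hat, posterior_variance, k) for one simulated examinee."""
    grid = np.linspace(-6.0, 6.0, 1201)
    post = np.exp(-0.5 * grid ** 2)  # assumed N(0, 1) prior, unnormalized
    post /= post.sum()
    theta_hat, var, k = 0.0, 1.0, 0
    while var > var_stop and k < max_items:
        # "Perfect" pool: generate an item at the currently optimal difficulty
        b = optimal_b(theta_hat, a, c_score)
        correct = rng.random() < p_correct(theta, a, b, c_true)
        like = p_correct(grid, a, b, c_score)
        post *= like if correct else (1.0 - like)
        post /= post.sum()
        theta_hat = float(np.sum(grid * post))
        var = float(np.sum((grid - theta_hat) ** 2 * post))
        k += 1
    return theta_hat, var, k
```

Because the grid posterior is exact for the assumed model, test lengths from this sketch need not reproduce the values reported in the tables.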

Analysis. A total of 42 test administration conditions were simulated:
14 "item pools" under each of the three guessing conditions. For
each test administration, the same sequential test properties estimated
in Study 1 were estimated: bias, mean absolute error, $r_{\theta e}$, $r_{\hat{\theta} e}$, $r_{\theta k}$,
$r_{\hat{\theta} k}$, $r_{\theta\hat{\theta}}$, and the mean and range of test length.
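
As an illustration of how these group statistics could be computed from the recorded triples ($\theta_i$, $\hat{\theta}_i$, k), the following sketch uses NumPy. The function name `summarize` and the dictionary keys are hypothetical; correlations involving a constant variable are returned as NaN, mirroring the report's practice of not computing correlations with test length when k was constant.

```python
import numpy as np


def summarize(theta, theta_hat, k):
    """Group statistics for one simulated test administration."""
    theta = np.asarray(theta, dtype=float)
    theta_hat = np.asarray(theta_hat, dtype=float)
    k = np.asarray(k, dtype=float)
    err = theta_hat - theta  # error of estimate for each examinee

    def corr(x, y):
        # A correlation is undefined when either variable has no variance
        if np.std(x) == 0.0 or np.std(y) == 0.0:
            return float("nan")
        return float(np.corrcoef(x, y)[0, 1])

    return {
        "bias": float(err.mean()),
        "mean_abs_error": float(np.abs(err).mean()),
        "r_theta_e": corr(theta, err),
        "r_theta_hat_e": corr(theta_hat, err),
        "r_theta_k": corr(theta, k),
        "r_theta_hat_k": corr(theta_hat, k),
        "fidelity": corr(theta, theta_hat),
        "k_mean": float(k.mean()),
        "k_range": (int(k.min()), int(k.max())),
    }
```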

Results 

No-guessing condition. As Table 3 shows, test length was constant
within item discrimination level under no-guessing, and diminished
as the level of item discrimination increased. The posterior variance
termination criterion was reached for all examinees using every item pool
except the one having a = .50. As a point of comparison with Study 1, test
termination was achieved in fewer than 30 items for item pools having
a ≥ 1.00. There was no correlation between test length (k) and $\theta$ or $\hat{\theta}$,
since there was no variance in test length for any test administration.

The overall bias of estimate under the no-guessing condition was
practically zero for all but the highly discriminating item pools (see
Table 3 and Figure 1). Mean absolute error was .17 for a = .5 and increased
fairly steadily to .22 for the a = 3.00 item pool. For the no-guessing
condition, then, there is a tendency for the highly discriminating item
pools to yield larger average errors than the moderately discriminating
item pools.



Table 3

Test Length, Mean Errors of Estimate, and Correlates of Ability ($\theta$) and Test Score ($\hat{\theta}$)
as a Function of Item Discrimination (a) in the Perfect Item Pool with No Guessing

                                                Item Discrimination (a)
Property                      .5    .6    .7    .8    .9   1.0  1.25   1.5  1.75   2.0  2.25   2.5  2.75   3.0

Test Length
  Mean                       100    71    52    41    33    27    18    13    11     9     7     7     6     5
  Minimum                    100    71    52    41    33    27    18    13    11     9     7     7     6     5
  Maximum                    100    71    52    41    33    27    18    13    11     9     7     7     6     5

Errors of Estimate
  Mean (Bias)                .00  -.01   .02   .01   .00   .01   .00   .01   .04   .06   .04   .05   .03   .04
  Mean Absolute Error        .17    --   .19   .19   .18   .19   .18   .21   .20   .21   .21   .20   .21   .22

Correlates*
  With Error
    $r_{\theta e}$          -.35  -.27  -.31  -.36  -.39  -.35  -.37  -.37  -.30  -.37  -.39  -.36  -.32  -.35
    $r_{\hat{\theta} e}$    -.17  -.08  -.10  -.16  -.20  -.15  -.17    --  -.07  -.15  -.16  -.14  -.09  -.10
  Fidelity (validity)
    $r_{\theta\hat{\theta}}$ .98   .98   .98   .98   .98   .98   .98   .97   .97   .97   .97   .97   .97   .97

*Correlations with length ($r_{\theta k}$ and $r_{\hat{\theta} k}$) were not computed since test length (k) was constant.
-- Entry illegible in the source.



Figure 1

Bias and Mean Absolute Error as a Function of Item
Discrimination (a), for the Perfect Item Pool with No Guessing

[Figure not reproduced. Vertical axis: mean error, 0 to .40; horizontal axis: discrimination (a), .5 to 3.0; curves: mean absolute error and bias.]



As in Study 1, errors of estimate ($\hat{\theta}_i - \theta_i$) correlated negatively with
$\theta$ (-.27 to -.39) and with $\hat{\theta}$ (-.08 to -.20). Again, these correlations
suggest a regression effect.




The fidelity coefficients were all .97 or .98, as "predicted" by the
posterior variance termination criterion value. Interestingly, the lower
fidelity coefficients occurred at the higher item pool discrimination
values.



Table 4

Test Length, Mean Errors of Estimate, and Correlates of Ability ($\theta$) and Test Score ($\hat{\theta}$)
as a Function of Item Discrimination (a) in the Perfect Item Pool with Uncorrected Guessing

                                                Item Discrimination (a)
Property                      .5    .6    .7    .8    .9   1.0  1.25   1.5  1.75   2.0  2.25   2.5  2.75   3.0

Test Length
  Mean                       100    71    52    41    33    27    18    13    11     9     7     7     6     5
  Minimum                    100    71    52    41    33    27    18    13    11     9     7     7     6     5
  Maximum                    100    71    52    41    33    27    18    13    11     9     7     7     6     5

Errors of Estimate
  Mean Absolute Error        .58   .48   .43   .46   .42   .39   .37   .37   .36   .40   .39   .38   .37   .39

The remaining rows are only partly legible in the source. Their entries are listed
in reading order without column alignment (-- marks a gap, ? a doubtful digit):

  Mean (Bias):                --, --, .42, .37, --, .27, .29, --, .12?, .11?, .29, .29
  $r_{\theta e}$:             -.51, -.46, -.49, -.4?, -.48, -.43, -.44, -.3?, --, -.11?, -.52?, -.12?, -.12?, -.32
  $r_{\hat{\theta} e}$:       -.29, -.23, -.23, -.19, -.20, -.13, -.16, -.04, -.01, --, --, .07, .02
  $r_{\theta\hat{\theta}}$:   .97, .97, .96, .95, .95, .9?, .96, .94, .9?, --, .91, .91, --, --

*Correlations with test length ($r_{\theta k}$ and $r_{\hat{\theta} k}$) were not computed since test length (k) was constant.





Uncorrected-guessing condition. As Table 4 shows, the test length data
were identical with those obtained under the no-guessing condition. Table 4
and Figure 2 show that both mean algebraic errors (bias) and absolute errors
were quite high (.57, .58) for the a = .50 item pool and decreased as a
increased, to about a = 1.25. For a > 1.25 the mean errors seemed to level
off, with moderately large values for both bias and absolute error.



Figure 2

Bias and Mean Absolute Error as a Function of Item
Discrimination (a), for the Perfect Item Pool with
Uncorrected Guessing

[Figure not reproduced. Vertical axis: mean error, up to .60; horizontal axis: discrimination (a), .5 to 3.0.]






As before, errors of estimate correlated negatively with ability; the
magnitude of the correlations was large for a = .50, then decreased as a
increased, until approaching a constant value at a > 1.75. Again, these
correlations suggest a regression effect. The correlations of errors with
ability estimates, $r_{\hat{\theta} e}$, followed a different trend under this condition
than was seen previously; $r_{\hat{\theta} e}$ was -.29 for a = .50, then showed a steady
algebraic increase with a, to a value of .07 at a = 2.75.

Fidelity coefficient values were everywhere lower with uncorrected
guessing than with corrected guessing, and decreased steadily from .97 to
.91 as a increased. As expected, fidelity increased with test length.



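Figure 3

Number of Items to Termination, with .20 Guessing

[Figure not reproduced. Vertical axis: number of items, up to 100; horizontal axis: discrimination (a), .5 to 3.0; plotted: the mean (dot) and range (vertical bar) of test length at each level of a.]
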

Corrected-guessing condition. Figure 3 graphically depicts test
length as a function of item discriminatory power (a). The vertical bars
in Figure 3 indicate the range of test length at a given a-level; the dot
indicates the mean test length for that level. As Table 5 and Figure 3
show, some variance in test length was present for all a levels except
a = .50 (where the termination criterion never was reached). Mean test length
to termination varied inversely with item discrimination, as in the other
conditions. Even with this perfect item pool, the termination criterion
was achieved in fewer than 30 items only for a > 1.00.

As Figure 4 shows, the bias of estimate was small but positive under
the corrected-guessing condition, increasing to meaningful levels only as
item pool discrimination exceeded a = 2.25. Mean absolute error was almost
constant across levels of a.






Table 5

Test Length, Mean Errors of Estimate, and Correlates of Ability ($\theta$) and Test Score ($\hat{\theta}$)
as a Function of Item Discrimination (a) in the Perfect Item Pool with Corrected Guessing

                                                Item Discrimination (a)
Property                      .5    .6    .7    .8    .9   1.0  1.25   1.5  1.75   2.0  2.25   2.5  2.75   3.0

Test Length
  Mean                       100   59?    77    60    48    40    27    20    16    13    11    10     9     9
  Minimum                    100    --    66    52    42    33    21    14    --     8     7     6     6     5
  Maximum                    100   100    88    69    57    49    32    --    --    --    --    --    --    --

Errors of Estimate
  Mean (Bias)                 --   .03   .02   .03   .02   .04   .01   .01   .01   .02    --   .06   .07   .08
  Mean Absolute Error        .22   .18   .16   .18   .19   .19   .16   .17   .19   .20   .18   .20   .19   .21

Correlates
  With Error
    $r_{\theta e}$          -.39  -.36  -.25  -.39  -.42  -.35  -.37  -.37  -.38  -.39  -.25  -.37  -.33  -.33
    $r_{\hat{\theta} e}$    -.17  -.18  -.09  -.20  -.23  -.16  -.19  -.18  -.18  -.19    --  -.14  -.10  -.09
  With Test Length
    $r_{\theta k}$             *    --   .80   .78   .78   .81   .81   .82   .85   .88   .95   .88   .90   .88
    $r_{\hat{\theta} k}$       *    --   .82   .81   .80   .83   .82   .84   .87   .89   .86   .90   .91   .90

Fidelity (validity), $r_{\theta\hat{\theta}}$: only partly legible in the source; entries in reading order,
without column alignment: .98, .99, .98, .98, --, .98, .98, .98, .98, .98, .97, .97, .97

*Correlations not computed since test length (k) was constant.
-- Entry illegible in the source. ? Doubtful reading in the source.

















As was seen in Study 1, test length correlated strongly with ability
(and ability estimates) where it was free to vary (Table 5). Since test
termination takes place only after a specified reduction of the posterior
variance has occurred, the large positive $r_{\theta k}$ correlations indicate that
the rate of posterior variance reduction is a function of ability level,
with more rapid reduction taking place as ability ($\theta$) decreases.



Figure 4

Bias and Mean Absolute Error as a Function of Item
Discrimination (a), for the Perfect Item Pool with
Corrected Guessing

[Figure not reproduced. Vertical axis: mean error, up to .40; horizontal axis: discrimination (a), .5 to 3.0; curves: mean absolute error and bias.]



As seen under the other conditions, Table 5 shows that errors of
estimate correlated negatively (-.25 to -.42) with ability and with ability
estimates (-.09 to -.23). As in the no-guessing condition, all fidelity
coefficients were .97 or .98, with the lower value occurring at the higher
item discrimination levels.

Conclusions 

Study 2 supports the findings of Study 1 and extends them somewhat.
As in Study 1, the Bayesian testing strategy resulted in very high fidelity
coefficients with relatively short tests, provided the item discriminating
powers were 1.0 or greater. The Study 1 finding of positive overall bias
of estimate was corroborated here: only one of the forty-two bias estimates
was negative. Especially noteworthy was the effect of uncorrected guessing
on both the ability estimates and the fidelity coefficients: bias was
severe, and fidelity actually decreased as discriminating power increased.

Under the corrected-guessing condition, the finding of a strong
positive correlation between test length and $\theta$ or $\hat{\theta}$ was replicated
consistently. It is important to note that this result was obtained under
conditions of a "perfect" item pool; this implies that the high correlation
does not result from inadequacies of the item pool. Since there was no
variance in test length when no guessing was assumed (i.e., for the
no-guessing and uncorrected-guessing conditions), the phenomenon would seem
to be due to the scoring formulae in some way. The phenomenon by itself
is of little concern unless it results in different measurement properties
at different levels of ability. This may be the case; some of the properties
of the sequential test seem to improve with test length. If test
length is consistently greater as ability increases, then the test may be
measuring less well as ability decreases, due simply to the effects of test
length.



Study 3: Effects of Fixed Test Length



Background and Purpose 

The results of Study 2 make it obvious that with guessing a factor,
test length increases with ability level when the posterior variance
criterion is used to terminate testing. It was suggested that some
measurement properties of the test may suffer as a consequence. Two properties
which seem to be affected adversely by short test length are bias and mean
absolute error, both of which increased as item discrimination became very
high (and test length very short) in the no-guessing and corrected-guessing
conditions (see Tables 3 and 5). Another property which should be
adversely affected by very short test lengths is fidelity. Study 2 noted
a small but consistent decline in fidelity at the very high discrimination
levels (see Tables 3, 4 and 5). Additionally, Jensema (1972) noted a
similar phenomenon, which he termed "correlation drop-off".






This study explored the effect of administering the same number of items
to all examinees, on the same properties which were of interest in Studies
1 and 2. This was done by means of simulating fixed-length Bayesian tests
for the corrected-guessing condition, under various item discrimination
levels. To avoid loss of generality, the "perfect" item pool was again
employed.

Method 

Variables. Dependent variables were the ability estimates ($\hat{\theta}$) and the
posterior variance ($\sigma^2_m$) after a fixed number (k) of items had been
administered. Independent variables were simulated ability ($\theta$) and item
discriminating power. Nine levels of discriminating power were studied: a = .6,
.8, 1.0, 1.25, 1.50, 1.75, 2.0, 2.5, 3.0. Examinees were the same 100
simulated ability values ($\theta_i$, i = 1, 2, ..., 100) used in Studies 1 and 2.

Item pools. "Perfect" item pools were simulated, as described in
Study 2; i.e., the locally optimum item difficulty was calculated after each
item response, and an item having that difficulty level was artificially
generated and administered.

Response generation and test administration. Item responses were
simulated in the same manner as in Studies 1 and 2. Test administration was
identical with Study 2, except that all "examinees" were administered 30
items. After 30 items, the individual ability ($\theta_i$), the estimate ($\hat{\theta}_i$), and
the posterior variance ($\sigma^2_{30}$) were recorded for each examinee.

Analysis. A total of nine test administrations were simulated (one at
each item discrimination level). For each administration these sequential
test properties were estimated as described in Study 1: bias, mean absolute
error, $r_{\theta e}$, $r_{\hat{\theta} e}$, and $r_{\theta\hat{\theta}}$. Additionally, for each administration, the
correlations of the posterior variance with $\theta$ and $\hat{\theta}$ were calculated.

Results

Table 6 and Figure 5 contain the results of Study 3. To facilitate
comparing the 30-item test length with the posterior variance termination
criterion, comparable data from Study 2 are included in Figure 5.

As Figure 5 shows, the overall bias of estimate was virtually zero in
all item pools, except for the a = .60 and a = 2.5 item pools. Mean absolute
error decreased steadily as a function of a, and was lower for fixed test
length than for the variable test length conditions for all discriminations
larger than a = 1.50. As in Studies 1 and 2, error ($\hat{\theta}_i - \theta_i$) correlated
negatively with $\theta$ and $\hat{\theta}$, suggesting a regression effect.

As Table 6 shows, the posterior variance correlated positively with
$\theta$ and $\hat{\theta}$, with the magnitude of the correlation generally diminishing as
a increased (e.g., $r_{\theta\sigma^2}$ was .86 for a = .6, and .74 for a = 3.0). This trend
corresponds to the one seen in Studies 1 and 2: test length correlates
strongly with ability when posterior variance is held constant.



Figure 5

Mean Absolute Error and Bias for Two Different
Test Termination Criteria

[Figure not reproduced. Vertical axis: mean error, up to .40; horizontal axis: discrimination (a), 1.0 to 3.0; curves: mean absolute error and bias for constant test length (30 items) and for the posterior variance termination criterion.]



The fidelity coefficients increased with the item discriminating
power, from .93 at a = .60 to .99 at a = 1.5 and higher.



Table 6

Errors of Estimate and Correlates of the Bayesian Sequential Test Ability
Estimates as a Function of Item Discrimination, for 30-Item Test Length
and Corrected Guessing, with Perfect Item Pool

                                      Item Discrimination (a)
Property                          .6    .8   1.0  1.25   1.5  1.75   2.0   2.5   3.0

Errors of Estimate
  Mean (Bias)                    .09   .01  -.01   .02  -.01   .00   .01   .04   .01
  Mean Absolute Error            .33   .28   .21   .17   .15   .12   .12   .12   .09

Correlates
  With Error
    $r_{\theta e}$              -.41  -.30  -.36  -.34  -.40  -.32   .32  -.51  -.36
    $r_{\hat{\theta} e}$        -.04   .01  -.13  -.15  -.24  -.19  -.18  -.36  -.23
  With Posterior Variance
    $r_{\theta\sigma^2_m}$       .86   .85   .89   .81   .82   .77   .69   .76   .74
    $r_{\hat{\theta}\sigma^2_m}$ .93   .90   .90   .84   .82   .79   .69   .72   .73

Fidelity
    $r_{\theta\hat{\theta}}$     .93   .95   .97   .98   .99   .99   .99   .99   .99






Conclusions 



It is apparent that some improvement in the properties of the
Bayesian testing procedure can be realized by setting test length constant,
provided that item discriminatory power is sufficiently high (e.g.,
greater than a = 1.5). Bias seems to be diminished, and absolute error
decreases as discrimination increases.



Study 4: Effects of Ability Level
and Item Pool Configuration

Background and Purpose

Simulation studies of Owen's Bayesian sequential test procedure
typically have concentrated their attention on group statistics. For
example, Urry (1971, 1974) and Jensema (1972, 1974) evaluate their results
in terms of fidelity coefficients and mean test length (using a posterior
variance termination criterion). Studies 1, 2, and 3 above have extended
Urry's and Jensema's work by examining additional properties of the
sequential testing procedure, but they also concentrate on group statistics.
With any group statistic, such as a fidelity coefficient, a bias estimate,
or a mean test length, there is a lack of invariance across groups. A
change in the shape of the distribution, or the central tendency and
variability, may alter the magnitude of the group statistic markedly. Therefore,
some distribution-free methods for evaluating the Bayesian sequential
adaptive test are needed. One general method for this is to examine
characteristics of the test as a function of ability level.

Given that some properties are to be evaluated as a function of
ability level, it is necessary to select the properties of interest. The
results of Studies 1, 2, and 3 suggest some characteristics of Owen's
procedure which bear further investigation. For instance, there was a
tendency in the preceding studies for positive bias to occur, i.e., for
the group average ability estimates to be larger than the average ability.
Additionally, there was consistently a moderate negative correlation
between ability and the errors of estimate, indicating a regression effect.
The negative correlation between the estimates themselves and their error
further suggests that the regression may be non-linear. The strong positive
correlation between test length and ability indicates that the posterior
variance estimate is being reduced more rapidly at low ability levels than
at high ones, despite the use of the "perfect" item pools and the presence
of constant item discrimination across all difficulty levels.

Based on the findings of Studies 1, 2, and 3, the present study
examined appropriate properties of the Bayesian sequential testing
strategy as a function of ability level. These properties include the form of
the regression of ability estimates on $\theta$, the conditional bias of the
ability estimates, and mean test length. In addition, this study included
estimation of the "information" (Birnbaum, 1968) in the Bayesian test
ability estimates at various levels of ability.



In addition to estimating the regression, bias and information in the
Bayesian test scores as a function of ability, this study examined the
effect which different item pool "configurations" might have on these
properties. Item pool configuration here refers to the regression of item
discrimination (a) values on the item difficulty (b) values in the item
pool. Studies 1, 2, and 3 above, and all previous research using "ideal"
item pools, have simulated item pools in which a was constant across items
or in which a was statistically independent of b. The presence of no
statistical association between a and b implies that the same item
information (Birnbaum, 1968, p. 449) is available at all levels of item
difficulty. On the other hand, if there is a statistical relationship
between the discrimination and difficulty values of the items in a given
item pool, there will be more information available in some ranges of the
ability continuum than there is in others.

Although in theory it is desirable for adaptive testing to assemble
an item pool having equally discriminating items at all the difficulty
levels represented, in practice this has not always been achieved. For
instance, the 58-item pool used by Jensema (1972) to simulate adaptive
testing based on some items from the Washington Pre-College examinations
had very highly discriminating items in its upper difficulty ranges and
low-to-moderately discriminating items in the easy range of difficulty.
Similarly, Lord (1974) reported that the discrimination parameters of his
item pool correlated positively with the difficulty parameters. Practical
implementations of adaptive testing are likely to use item pools in which
the configuration of the item parameters is less than ideal. Therefore,
the effects of different item pool configurations on the psychometric
characteristics of the test scores (or trait estimates) need to be
investigated.

This study investigated three different configurations of the item
pools. Each configuration was characterized by a different slope of the
regression of item discrimination parameters on item difficulty, which in
turn can be characterized approximately in terms of the correlation, $r_{ab}$,
between item discriminating power and difficulty. Identical test simulation
studies were conducted under all three configurations in order to evaluate
any differential effects.

Method 

Variables. Dependent variables were the ability estimates ($\hat{\theta}$) and the
number of items (k) required to satisfy the test termination criterion.
Independent variables were the simulated examinees' abilities ($\theta_i$) and the
configuration of the simulated item pool. Examinees' abilities for each
test administration were simulated by 3100 values of $\theta_i$, 100 at each of 31
equally spaced levels in the interval [-3.0 ≤ $\theta$ ≤ +3.0]. This examinee
distribution was used because of the need for relatively large numbers of
observations at each level of $\theta$ in order to estimate accurately the regression
of ability estimates on ability, the conditional bias, and the information
curves.
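
The examinee distribution described above can be generated in two lines. This is a small sketch using NumPy; the variable names `levels` and `abilities` are illustrative.

```python
import numpy as np

# 31 equally spaced ability levels from -3.0 to +3.0 (spacing of 0.2)
levels = np.linspace(-3.0, 3.0, 31)

# 100 simulated examinees at each level, 3100 in all
abilities = np.repeat(levels, 100)
```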






Item pools. Three "perfect" item pools were simulated, one for each
configuration. The three configurations studied included one with a
moderate positive correlation of a with b (referred to hereinafter as
$r_{ab}+$), one with a moderate negative correlation ($r_{ab}-$), and one with no
correlation ($r_{ab}0$). The $r_{ab}+$ configuration favored the more difficult
items with higher discriminating powers, the $r_{ab}-$ configuration favored the
easier items, and the $r_{ab}0$ configuration favored no difficulty levels.

As in Studies 2 and 3, after each item response the optimal difficulty
of the next item to administer was calculated, and an item having that
difficulty value was artificially generated and administered. In the
previous studies, the optimal difficulty calculation was based on the
guessing parameter (c) and on the constant discrimination parameter (a) of
the items in the pool. In this study, the same calculation was based on
the mean item discrimination parameter ($\bar{a}$), which was 1.25 for all
configurations. In all cases, c was .20.

The item pool configuration was simulated by:

1. Selecting the appropriate b for the next item from the
   perfect item pool as though all $a_g$ were equal to $\bar{a}$; call
   this $b^*_g$.

2. Calculating a conditional a value, $\hat{a}_g|b^*_g$, from a linear transform
   of $b^*_g$ [equation illegible in the source],
   where $S.D._a$ is the standard deviation of the a parameters
   in the simulated pool;
   $S.D._b$ is the standard deviation of the b parameters in the
   simulated pool;
   $\bar{a}$, $b^*_g$, and $r_{ab}$ are as previously defined.

3. Adding an error component, $e_g$, to the approximate $a_g$, so that
   for each item administered $a^*_g = \hat{a}_g|b^*_g + e_g$,
   where $a^*_g$ is the simulated discriminating power of the item;
   $\hat{a}_g|b^*_g$ is the approximate discrimination defined above;
   $e_g$ is a random number from a normal (0, $\sigma_e$) population
   [the condition defining $\sigma_e$ is illegible in the source].

4. Setting $a^*_g$ equal to .80 whenever it would otherwise have a
   lower value.
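
Steps 2 through 4 can be sketched as follows. Because the two equations are not legible in the source, this sketch substitutes the standard linear-regression forms, which are assumptions: the predicted discrimination is taken as $\bar{a} + r_{ab}(S.D._a / S.D._b)(b^*_g - \bar{b})$, and the error standard deviation as $S.D._a\sqrt{1 - r_{ab}^2}$. The default values of `sd_a`, `sd_b`, and `b_mean`, and the function name, are hypothetical; only the mean discrimination of 1.25 and the floor of .80 come from the text.

```python
import numpy as np


def simulate_discrimination(b_star, r_ab, rng, a_mean=1.25, sd_a=0.25,
                            sd_b=1.0, b_mean=0.0, a_floor=0.80):
    """Simulated discriminating power for an item of difficulty b_star."""
    # Step 2 (assumed regression form): predicted a given b_star
    a_pred = a_mean + r_ab * (sd_a / sd_b) * (b_star - b_mean)
    # Step 3 (assumed residual spread): add a normally distributed error
    sd_e = sd_a * np.sqrt(1.0 - r_ab ** 2)
    a_star = a_pred + rng.normal(0.0, sd_e)
    # Step 4: never let the discrimination fall below the floor of .80
    return max(float(a_star), a_floor)
```

Setting `r_ab` to a positive, negative, or zero value would correspond to the three configurations studied; the magnitudes actually used are not given in this section.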

Response generation and test administration. Item responses were
simulated in the same manner described in Study 1. Test administration was
identical with Study 1. A posterior variance termination criterion of
$\sigma^2_m = .0625$ was used, with an arbitrary maximum test length of 30 items. The
corrected-guessing condition was used. At termination, the ability ($\theta_i$),
its estimate ($\hat{\theta}_i$), and the number of items administered (k) were recorded
for each examinee.

Analysis. For each of the three simulated test administrations, the
following properties of the sequential test were estimated from the 100
observations at each separate ability level ($\theta_i$):

a. the conditional mean, $\bar{\hat{\theta}}|\theta_i = \frac{1}{100}\sum_{j=1}^{100}\hat{\theta}_{ij}$ [11]

b. the conditional variance, $s^2_{\hat{\theta}|\theta_i} = \frac{1}{100}\sum_{j=1}^{100}\left(\hat{\theta}_{ij} - \bar{\hat{\theta}}|\theta_i\right)^2$ [12]

c. the conditional bias, $e_i = \bar{\hat{\theta}}|\theta_i - \theta_i$ [13]

d. the conditional mean test length, $\bar{k}|\theta_i$.

The regression of the trait estimates ($\hat{\theta}$) on ability ($\theta$) was estimated
by fitting a third degree polynomial to the 31 conditional means, using a
least squares method. The regressions of bias and test length on $\theta$ were
estimated graphically.
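
The conditional statistics and the cubic regression could be computed along the following lines. This is a sketch, not the authors' program: the function name `conditional_summary` is hypothetical, the variance divides by the number of observations at each level (100 in the study design), and `numpy.polyfit` stands in for the unspecified least squares method.

```python
import numpy as np


def conditional_summary(theta, theta_hat, k):
    """Per-level mean, variance, bias, mean test length, and cubic fit."""
    theta = np.asarray(theta, dtype=float)
    theta_hat = np.asarray(theta_hat, dtype=float)
    k = np.asarray(k, dtype=float)
    levels = np.unique(theta)  # the distinct ability levels, sorted
    cond_mean = np.array([theta_hat[theta == t].mean() for t in levels])
    # ndarray.var() divides by n, the number of observations at the level
    cond_var = np.array([theta_hat[theta == t].var() for t in levels])
    cond_bias = cond_mean - levels
    cond_k = np.array([k[theta == t].mean() for t in levels])
    # Third degree polynomial fitted to the conditional means
    coefs = np.polyfit(levels, cond_mean, 3)
    return levels, cond_mean, cond_var, cond_bias, cond_k, coefs
```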



The information in a set of test scores (x) can be defined as

$$I_x(\theta) = \left[ \frac{\frac{\partial}{\partial\theta} E(x|\theta)}{\sigma_{x|\theta}} \right]^2 \qquad [14]$$




The "information" value of test scores at any level of ability is an index
of the usefulness of those scores for discriminating among examinees in the
vicinity of that level. A zero information value indicates that the test
scores are useless for making discriminations about a given point; an
infinite information value indicates that error-free discriminations can be
made about that point on the basis of the test scores. Any value between
the two extremes has implications for the probability of making Type I and
Type II errors in classifying persons above or below the point in question.




The numerator in Equation 14 is the first partial derivative of the
function describing the regression of test scores (x) on the trait ($\theta$).
The denominator in Equation 14 is the conditional standard deviation of the
scores. The regression of test scores on $\theta$ can be approximated from
empirical data, if the scores (x) and the latent trait values ($\theta$) are known.

Since the Bayesian trait estimates ($\hat{\theta}$) can be treated as test scores,
the numerator of the information function can be evaluated at any point ($\theta'$)
from the slope of the equation for the regression of $\hat{\theta}$ on $\theta$. That equation
was calculated from the simulation data as described above. In estimating
the information curves, the first partial derivative (i.e., the slope) of
that polynomial equation was evaluated at each of the 31 $\theta$ points used in
the study. The denominator of the information function at each of the
same 31 points was estimated by the square root of the conditional variance
of the trait estimates at that point.



Figure 6

Mean Estimated Ability ($\hat{\theta}$) at 31 Ability Points ($\theta$)
for the Simulated Bayesian Sequential Test under
Three Item Pool Configurations

[Figure not reproduced.]



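Thus for each of 31 points $\theta'$, the information at that point, $\hat{I}_{\hat{\theta}}(\theta')$,
was estimated from the test simulation data, as

$$\hat{I}_{\hat{\theta}}(\theta') = \left[ \frac{\frac{\partial}{\partial\theta} E(\hat{\theta}|\theta')}{\sigma(\hat{\theta}|\theta')} \right]^2 \qquad [15]$$

where $E(\hat{\theta}|\theta')$ is the third degree polynomial regression fitted to
the 31 test score means;

$\sigma(\hat{\theta}|\theta')$ is the square root of the observed variance of the
100 test scores at $\theta'$.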
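
Equation 15 amounts to differentiating the fitted cubic and dividing by the conditional standard deviation at each ability level. A brief sketch, assuming the conditional means and standard deviations have already been computed; the function name `estimated_information` is hypothetical.

```python
import numpy as np


def estimated_information(levels, cond_mean, cond_sd):
    """Squared ratio of regression slope to conditional SD at each level."""
    levels = np.asarray(levels, dtype=float)
    cond_sd = np.asarray(cond_sd, dtype=float)
    # Cubic regression of mean ability estimates on ability
    coefs = np.polyfit(levels, cond_mean, 3)
    # Slope of the fitted polynomial evaluated at each ability level
    slope = np.polyval(np.polyder(coefs), levels)
    return (slope / cond_sd) ** 2
```

For example, a regression that is linear with slope .8 and a constant conditional standard deviation of .25 would give an information value of (.8 / .25)^2 = 10.24 at every level.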



Results 



Regression of $\hat{\theta}$ on $\theta$. Figure 6 is a plot of the observed mean ability
estimates ($\hat{\theta}$) as a function of actual trait level ($\theta$) differentiated by item
pool configuration; Appendix Table A-1 shows the numerical values of these
means. For each configuration, then, Figure 6 contains the graphic empirical
approximation of the regression of $\hat{\theta}$ on $\theta$. The values for each item pool
configuration form an essentially linear plot for levels of $\theta$ between +1 and
-1, with a tendency toward departure from linearity for values of $\theta$ larger
than +1 and smaller than -1. High abilities are underestimated; low abilities
are overestimated. The exaggeration of this effect seems strongest for the
$r_{ab}-$ configuration, in which the average item discrimination increased as the
ability estimates decreased.



Figure 7

Mean Error of Estimate ($\hat{\theta} - \theta$) at 31 Ability Points ($\theta$)
for the Simulated Bayesian Sequential Test under
Three Item Pool Configurations

[Figure not reproduced. Vertical axis: mean error, -.80 to .80; horizontal axis: ability ($\theta$), -3.0 to 3.0.]



Bias. Figure 7 contains the plot of conditional bias (mean ($\hat{\theta} - \theta$)) on
ability (numerical values are in Appendix Table A-1 as e). For each
configuration, the curve described by these data is non-linear. As Figure 6
showed indirectly, the conditional bias for all three configurations was
close to zero for $-1 \le \theta \le 1$, but it increased with increases in absolute values
of $\theta$ elsewhere. A strong tendency to underestimate high $\theta$ was present in
all three configurations, and was severe for $r_{ab}-$, for which the bias was
-.43 at $\theta = 3.0$. The tendency to overestimate low $\theta$ was even more pronounced,
and was severe for all three item pool configurations. For the $r_{ab}0$
configuration the conditional bias at $\theta = -3$ was .53; for $r_{ab}$ [sign illegible in the source] the bias at
the same point was .61. If the $\theta$ metric is expressed in population standard
deviation units, then, the Bayesian sequential test estimates may typically
err by one-half standard deviation unit at low extremes of the ability range
and by a lesser but still significant amount at the high extremes.
Furthermore, this tendency is systematically affected by the configuration of the
item pool.



Figure 8

Mean Number of Items to Termination (Test Length) at 31
Ability Points ($\theta$) for the Simulated Bayesian Sequential
Test under Three Item Pool Configurations

[Figure not reproduced. Vertical axis: mean test length, up to 30 items; horizontal axis: ability ($\theta$), -3.0 to 3.0.]



Figure 8 contains plots of mean test length as a function of ability
level for each item pool configuration (numerical values are in an Appendix
table). For the $r_{ab}0$ configuration, test length was constant at 30
items, the arbitrary maximum. For $r_{ab}+$, where the most discriminating items
were available at the higher difficulty levels, test length was constant at
30 items for $\theta$ levels less than .6, then declined gradually to a mean of 23
items at $\theta = 3$. The $r_{ab}-$ configuration, which had higher item discrimination
at the lower difficulty levels, showed a trend opposite that for $r_{ab}+$. For
$r_{ab}-$, test length increased rapidly with $\theta$ from a mean of 14 items at
$\theta = -3$, to 30 items at $\theta = 0$; for all $\theta$ greater than zero, the test length was
30 items, the arbitrary maximum.

Figure 8 illustrates two interesting trends. First, not only did the
$r_{ab}-$ configuration use fewer items than the others, but the rate of increase
as $\theta$ increased is noticeably steeper than the rate of decline in test length
for $r_{ab}+$. Second, for $r_{ab}+$, which required the fewest items at high $\theta$
levels, bias (see Figure 7) was least pronounced at high $\theta$ levels; yet for
$r_{ab}-$, which required fewest items at low $\theta$ levels, there is no apparent
advantage at those levels in terms of bias.



Figure 9

Smoothed Information Curves for the Bayesian Sequential
Test under Three Different Item Pool Configurations

[Plot: information (vertical axis) against ability (θ) (horizontal axis,
-3.0 to 3.0).]



Information. Figure 9 contains smoothed information curves for the
three item pool configurations. (Numerical values of the estimated slopes,
conditional standard deviations, and information values at each of the 31
θ levels are shown in Appendix Table A-2.) For the r_ab ≈ 0 configuration the
information curve shown in Figure 9 is convex, reaching its maximum height
very near θ = 0; the curve slopes gradually downward as θ increases above 0,
and more rapidly downward as θ decreases from 0. At θ = -3 the information
curve is quite low, indicating that despite the availability of test items
at all difficulty levels, the test scores will discriminate very poorly in
the low ability ranges.
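
The relationship among the three quantities tabulated in Appendix Table A-2
can be sketched in a few lines. The sketch assumes that the information value
is the squared ratio of the estimated slope of the regression of the trait
estimate on θ to the conditional standard deviation of the estimate; that
assumption reproduces the tabled values used below, but the function name and
the rounding to two decimals are illustrative choices, not the authors' code.

```python
def information(slope, cond_sd):
    """Information at one trait level, assumed to be (slope / SD) squared."""
    return (slope / cond_sd) ** 2

# (theta, estimated slope, conditional SD) taken from the first column
# block of Table A-2 at theta = -3.0 and theta = -2.8.
rows = [(-3.0, 0.523, 0.307), (-2.8, 0.566, 0.353)]

values = [round(information(slope, sd), 2) for _, slope, sd in rows]
print(values)  # [2.9, 2.57], matching the tabled 2.90 and 2.57
```

Under this reading, a flat information curve requires that the slope and the
conditional standard deviation change in step with one another across θ.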






For the r_ab+ configuration the information value at θ = -3 is even
lower, but it increases steadily, almost linearly, with θ. The r_ab+
information curve surpasses that of r_ab ≈ 0 at θ > +1, as expected from the
availability of more discriminating items in the higher difficulty ranges.
For the r_ab- configuration, which had its lowest item discriminations in
the higher difficulty ranges, the information curve is quite low at high
ability levels, and it increases steadily as θ decreases, to about θ = 0.
Surprisingly, the information curve thereafter decreases with θ, reaching
its lowest point at θ = -3. This is a striking result in view of the
availability of more discriminating items at low θ levels for the r_ab- item
pool. It can be partly, but not entirely, accounted for by the shorter test
lengths seen for the r_ab- configuration at the low ability levels.



General Summary and Conclusions 

Previous research (e.g., Urry, 1971, 1974; Jensema, 1972) has shown
that Owen's Bayesian sequential approach to adaptive testing has the
potential of achieving very high correlations between ability level and
ability estimate concomitant with a significant savings in test length,
compared to conventional testing procedures. In order for this potential
to be realized, a relatively large item pool was required, with highly
discriminating items (a > .80) rectangularly distributed on the difficulty
continuum (Urry, 1974). Study 1 corroborated the findings of Urry and
Jensema in terms of test length and values of the fidelity coefficients.
At the same time Study 1 revealed an overall tendency for the Bayesian
trait estimators to overestimate group mean ability level. Also, the
results of Study 1 corroborated the finding in live-testing that with
Owen's strategy test length covaries positively with ability level.

The results of Study 1 were not definitive, partly because finite
item pools were employed. Study 2 overcame the specificity of Study 1 by
introducing the use of a "perfect" (or infinite) item pool, having unlimited
numbers of independent items at any difficulty level. At the same
time, Study 2 varied the values of the guessing parameter.

The results of Study 2 suggest that the bias problem seen in Study 1
may be largely a result of guessing; under the no-guessing condition bias
was virtually zero, except for the very highly discriminating item pools.
This relationship was confounded with test length, however, since the
highly discriminating item pools reached the test termination criterion in
a very small number of items (e.g., 5 items at a = 3.00). Under the
corrected-guessing condition, bias was consistently positive, and increased
as item discriminations increased and mean test length became very short.
Under the uncorrected-guessing condition, both bias and mean absolute
error were pronounced.

The high correlation between test length and ability level was
consistently present in Study 2 under the corrected-guessing condition. Under
no-guessing and uncorrected-guessing, however, there was no such correlation
because there was no variance in test length within a test. Under
the latter conditions, test length varied only across tests, i.e., as a
function of item discriminating power.

In terms of fidelity coefficients, there was no appreciable difference
between those obtained under no-guessing and under corrected-guessing,
given the common termination criterion. Under uncorrected-guessing,
however, there was some loss of fidelity as test length decreased. It
should be noted that the uncorrected-guessing condition was tantamount to
assuming an inappropriate item response model. The result of using the
inappropriate model to estimate ability and to select items sequentially
was to introduce large errors of estimate and some loss of fidelity.

The observation that bias, absolute error, and fidelity seemed to be
adversely affected by the short test lengths typical of highly
discriminating item pools led to using a fixed 30-item test length in Study 3.
The results confirmed the hypothesis that some undesirable psychometric
properties may accompany the use of very highly discriminating item pools
if the posterior variance criterion is used to terminate testing. When
test length remained constant, bias was virtually zero and absolute error
diminished steadily as item discrimination increased.
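
The three group-level indices compared across Studies 1 to 3 can be sketched
as follows. This is a minimal illustration, assuming that bias is the mean
signed error of the estimates, that absolute error is the mean unsigned
error, and that the fidelity coefficient is the product-moment correlation
between true and estimated ability. The function name and the five data
points are hypothetical and are not taken from the report.

```python
import math

def summary_indices(true_theta, est_theta):
    """Return (bias, mean absolute error, fidelity) for paired values."""
    n = len(true_theta)
    errors = [e - t for t, e in zip(true_theta, est_theta)]
    bias = sum(errors) / n                      # mean signed error
    mean_abs_error = sum(abs(d) for d in errors) / n
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(true_theta, est_theta))
    var_t = sum((t - mean_t) ** 2 for t in true_theta)
    var_e = sum((e - mean_e) ** 2 for e in est_theta)
    fidelity = cov / math.sqrt(var_t * var_e)   # Pearson correlation
    return bias, mean_abs_error, fidelity

# Hypothetical estimates that are pulled toward zero at both extremes.
true_theta = [-2.0, -1.0, 0.0, 1.0, 2.0]
est_theta = [-1.4, -0.8, 0.1, 0.9, 1.7]
bias, mean_abs_error, fidelity = summary_indices(true_theta, est_theta)
```

In this made-up example the fidelity is close to 1 even though the extreme
estimates are visibly regressed, which is the kind of pattern a group
statistic alone would not reveal.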

The interrelationships of test length, item discrimination, bias, and
absolute error would be a fruitful avenue for further research. If the
interdependencies were understood it would be possible for a test user to
control error magnitudes by appropriate choice of test length, given
knowledge of the parameters of the items in the item pool.

Study 4 investigated some of the characteristics studied earlier but
as a function of trait level. The curvilinear regression of the latent
trait estimators on trait level illustrates the conservative nature of Bayes
estimators. Fairly accurate estimation is achieved in the vicinity of the
assumed prior mean, at the expense of accuracy in the extremes. In a
sense, the Bayesian procedure gives little "credence" to extreme trait
values; this conservatism results in a consistent tendency to underestimate
high trait level values and to overestimate low ones. With guessing present
the overestimation problem becomes accentuated. This alone may be
sufficient to explain the positive bias seen in Studies 1 and 2: the
overestimates tend to be of larger magnitude than the underestimates,
resulting in an overall tendency towards overestimation.
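
The regression toward the prior mean described above can be illustrated with
a deliberately simplified model. The sketch below is not Owen's procedure,
which updates a normal prior from right/wrong item responses. It assumes
instead a single normally distributed observation of θ with known error
variance, for which the posterior mean has a closed form. The function name
and the error variance of .25 are hypothetical.

```python
def posterior_mean(observation, error_var, prior_mean=0.0, prior_var=1.0):
    """Posterior mean for a normal prior and one normal observation.

    The result is a precision-weighted average of the prior mean and
    the observation, so it always lies between the two.
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / error_var
    weighted_sum = prior_mean * prior_precision + observation * data_precision
    return weighted_sum / (prior_precision + data_precision)

# An unbiased observation at each extreme is pulled toward the prior mean.
high = posterior_mean(3.0, error_var=0.25)    # underestimates theta = 3
low = posterior_mean(-3.0, error_var=0.25)    # overestimates theta = -3
print(round(high, 3), round(low, 3))  # 2.4 -2.4
```

The pull is symmetric in this simplified model; the asymmetry reported in
the text arises once guessing is present.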

More significant than the direction of the conditional bias is its
form. Under all three item pool configurations in Study 4, the bias curves
were non-linear. In ability testing, bias is not usually of concern as
long as it is constant or linear in the parameter being estimated (Lord,
1970, p. 153), since these two cases imply a linear relationship between
test scores and trait level parameters. Non-linear bias, on the other hand,
implies a non-linear relationship, which in turn adversely affects the
utility of the test scores. Other things being equal (e.g., the conditional
variances of the test scores), if the regression of test scores on trait
level is non-linear, the scores will make better discriminations at some
trait levels than at others.




That this is the case with the scores resulting from Bayesian test
administration is evident in the information curves estimated from the data.
Although adaptive testing has the potential to result in equi-discriminating
ability estimates, the Bayesian sequential adaptive test has failed to
achieve this goal under the conditions simulated in Study 4. Under each
item pool configuration, some regions of the ability continuum had
considerably higher levels of information than others. Even under the
r_ab- configuration, where the best discriminating items were available in
the lowest difficulty regions, the information curve was very low in the low
ability region.

Lord (1970, p. 152) indicated that evaluating an adaptive test by means
of a group statistic (such as the fidelity coefficient, r) presumes some
knowledge of the group's distribution on the trait being measured, and
ignores information relevant to the accuracy of trait estimates at any one
level of the trait. The validity of the Bayesian sequential test trait
estimates, as the results show, was quite high under the conditions used in
these simulation studies. The accuracy of the estimates was also favorable
in what corresponds to the middle ranges of a normal distribution on θ, but
was found to be less favorable in the extremes, especially the lower extreme.
Similarly, the information curves of the trait estimates showed that the
effectiveness of measurement under the Bayesian testing procedure varied
systematically as a function of the configuration of the item parameters
constituting the item pool, but in all three configurations measurement
effectiveness was very low in the low ranges of the trait.

The observed loss of accuracy and information in the extremes of the
"typical" range of θ is disturbing, since a major advantage of adaptive
testing over conventional testing is the former's supposed potential for
superior measurement accuracy and effectiveness in those extremes. The data
of this series of studies show that with the exception of the r_ab+
configuration, the adaptive test scores behave much like conventional test
scores, at least in terms of the shapes of their information curves. The
utility of the Bayesian adaptive testing strategy may be diminished by
results like those reported for Study 4, if they prove to be general.

The problems of bias which is non-linear in θ, and of convex
information curves as observed in Study 4, have causes which may be amenable
to improvement. Central to both problems is the effect of guessing, which
generally operates to reduce measurement efficiency at all trait levels,
and especially at low trait levels. Also at the core of the problems
is the Bayesian procedure itself. As was pointed out earlier, the Bayesian
trait estimates behave like regression estimates. Extreme values of θ
are systematically regressed toward the initial prior estimate; the
assumption of a normal prior distribution of θ ensures this tendency.
On the average, the more extreme θ is for any individual, the larger
will be the regression effect. Recall that the item selection procedure
selects an item with difficulty somewhat easier than the current θ
estimate. But for high θ the current estimate is almost always too low.
Hence the difficulty of the selected item will almost always be too easy
for extremely able examinees. Cumulated over 30 items, for example,
there will be several effects of this inappropriate item selection:

1. Mean proportion correct will tend to increase as a function
of θ, despite the implicit attempt of the tailoring procedure
to make it constant at all levels of θ;

2. θ will tend to be underestimated for high θ due to the
inappropriate difficulty of the test items administered;

3. Information loss will occur at high θ due to the shallowing
slope of the regression of θ̂ on θ.

For low θ the initial prior is an overestimate. Hence the first
item selected will generally be too difficult, yet the examinee has a
chance of answering it correctly by guessing. A correct answer, of course,
will cause an increase in θ̂ and thus result in another inappropriate choice
of item difficulty. Furthermore, as Samejima (1973) has shown, when
guessing is a factor there may actually be negative information in a
correct response to an item whose difficulty exceeds an examinee's
actual trait level by a fairly small increment. Thus it appears that in
Owen's Bayesian strategy, testees in the low extremes of θ are rather
consistently being administered overly difficult items with several
systematic results:

1. Mean proportion correct tends to decrease with θ despite the
tailoring process;

2. Posterior variance reduction tends to be more rapid for individuals
of low trait levels, due largely to their sub-optimal proportion
of correct responses, resulting in shorter mean test length;

3. The shorter the test length, the less opportunity the Bayesian
estimation procedure has to converge to extreme trait level
estimates;

4. Non-convergence combines with negative information in some correct
responses to diminish severely the effectiveness of measurement in
the low regions of the trait.
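
The role of guessing in the mechanism just described can be sketched with an
item response function. The sketch assumes Birnbaum's three-parameter
logistic form with the conventional scaling constant 1.7; the response model
actually used in these simulations is specified earlier in the report and may
differ. The function name and the parameter values are hypothetical.

```python
import math

def p_correct(theta, b, a=1.0, c=0.2, scale=1.7):
    """Probability of a correct response under a three-parameter logistic
    model (assumed form): c is the guessing floor, a the discrimination,
    b the item difficulty."""
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))

# A low-ability examinee (theta = -3) given an item pitched at the prior
# mean (b = 0) succeeds almost entirely by guessing.
p_low = p_correct(-3.0, b=0.0)

# The same item matched to the examinee's level gives a much higher rate.
p_matched = p_correct(-3.0, b=-3.0)

print(round(p_low, 3), round(p_matched, 3))  # 0.205 0.6
```

A correct answer produced at close to the guessing rate still raises the
current estimate, which is how an overly difficult first item can lead to
further overly difficult items.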

Some of the conclusions just stated are speculative. Specifically,
neither proportion correct as a function of θ nor the differences (θ̂ - θ)
were examined in this study. Both of these reflect the effectiveness of
the tailoring process. McBride (1975), however, reported data which
showed proportion correct to be monotonically related to θ in another
simulation study of Owen's Bayesian strategy.

One goal of adaptive testing should be to achieve a constant high
level of measurement effectiveness at all levels of θ. This objective
is equivalent to a high, horizontal information function. The Study 4
results show that the Bayesian sequential testing strategy failed to
achieve this goal despite an unrealistically favorable set of
circumstances: the perfect item pool, error-free item parameters, and a
scoring model perfectly congruent with the item response model. The
shortcomings of the Bayesian trait estimate were attributed to the
regression-like tendency of the sequential estimates themselves, which in
turn results in inappropriate item selection for individuals whose trait
levels are relatively high or low.

There are at least two methods of ameliorating this problem, both
of which to some extent should lessen the bias of estimate at the extremes
and improve the information properties of the trait estimates. The first
method involves the assumption of a rectangular rather than a normal prior
distribution of θ. The second method would involve replacing the Bayesian
item selection procedure with a mechanical (e.g., non-mathematical)
branching procedure, which would be less sensitive to large errors in the
current trait estimate in its choice of the next item to administer.
Needless to say, both of these alternatives involve a considerable
departure from Owen's elegant procedure.
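
The first of these two methods can be illustrated numerically. The sketch
below is an assumption-laden toy, not Owen's closed-form updating: it
evaluates the posterior on a discrete grid of θ values from -4 to 4, uses a
three-parameter logistic response function, and compares the posterior mean
under a standard normal prior with that under a rectangular prior for the
same response pattern. All names and parameter values are hypothetical.

```python
import math

def p_correct(theta, b, a=1.0, c=0.2, scale=1.7):
    """Assumed three-parameter logistic response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))

def posterior_mean(responses, items, prior):
    """Grid approximation to the posterior mean of theta on [-4, 4]."""
    grid = [-4.0 + 0.1 * k for k in range(81)]
    weights = []
    for theta in grid:
        w = prior(theta)
        for u, (b, a, c) in zip(responses, items):
            p = p_correct(theta, b, a, c)
            w *= p if u == 1 else (1.0 - p)   # likelihood of this response
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

def normal_prior(theta):
    return math.exp(-0.5 * theta * theta)     # unnormalized N(0, 1)

def rectangular_prior(theta):
    return 1.0                                # flat over the grid

# Ten items of difficulty 1.0, all answered correctly (a high-ability pattern).
items = [(1.0, 1.0, 0.2)] * 10
responses = [1] * 10
est_normal = posterior_mean(responses, items, normal_prior)
est_rect = posterior_mean(responses, items, rectangular_prior)
```

For this response pattern the rectangular prior yields the more extreme
estimate, which is the direction of improvement the text anticipates. The
bounded grid itself acts as a prior, so the size of the difference should not
be read as a prediction for Owen's procedure.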

Implications. In testing persons of any given ability level, an
ideal adaptive testing strategy would select for administration the most
informative items available at that level. If the item pool were adequate,
the result would be that mean proportion correct would be approximately
constant across ability levels, and the information curve of the ability
estimates would be very high and almost flat. Such an adaptive test would
make equally good discriminations at any level of the ability trait. It
would also have approximately equivalent utility at any level at which
discriminations were to be made. It is apparent from the foregoing
discussion, especially from the data of Study 4, that the properties of
the Bayesian sequential adaptive test fall somewhat short of this ideal.
The research reported here has shown that the Bayesian procedure results
in very high correlations of ability level and test scores but also results
in ability estimates which are strongly biased in the extremes and which
are maximally informative only in the middle region of ability. If a test
user were concerned primarily with ordering examinees as to ability level,
the Bayesian sequential adaptive procedure would seem quite satisfactory.
However, the tendency of the Bayesian procedure to yield accurate measurement
in the vicinity of the prior mean, at the expense of relatively inferior
measurement elsewhere, may mandate selecting an alternative adaptive
strategy if the test user requires either equi-discriminating measurement
over a wide ability range or accurate ability estimation for ability levels
not near the mean. Simulation research by Vale & Weiss (1975) on
the stradaptive ability test (Weiss, 1973) shows that this adaptive testing
strategy provides measurement with the desired characteristics. Other
promising strategies for adaptive testing have been proposed by Lord
(1975) and Samejima (1975).






References 

Betz, N.E. & Weiss, D.J. Simulation studies of two-stage ability testing.
Research Report 74-4, Psychometric Methods Program, Department of
Psychology, University of Minnesota, Minneapolis, 1974.

Birnbaum, A. Some latent trait models and their use in inferring an
examinee's ability. In Lord, F.M. and Novick, M.R., Statistical
theories of mental test scores. Reading, Mass.: Addison-Wesley,
1968 (Chapters 17-20).

Green, B.F. Discussion. In Proceedings of the Conference on Computerized
Adaptive Testing. Washington, D.C. June, 1975.

Jensema, C.J. An application of latent trait mental test theory to the
Washington Pre-College Testing Program. Unpublished doctoral
dissertation, University of Washington, 1972.

Jensema, C.J. The validity of Bayesian tailored testing. Educational and
Psychological Measurement, 1974, 34, 757-766.

Lord, F.M. Some test theory for tailored testing. In Holtzman, W.H.
(Ed.), Computer-assisted instruction, testing, and guidance.
New York: Harper & Row, 1970 (Chapter 8).

Lord, F.M. The "ability" scale in item characteristic curve theory.
Research Bulletin 74-19. Princeton, N.J.: Educational Testing Service,
June, 1974.

Lord, F.M. A broad-range test of verbal ability. Research Bulletin 75-5.
Princeton, N.J.: Educational Testing Service, 1975.

Lord, F.M. & Novick, M.R. Statistical theories of mental test scores.
Reading, Mass.: Addison-Wesley, 1968.

McBride, J.R. Problem: scoring adaptive tests. In Weiss, D.J. (Ed.)
Computerized adaptive trait measurement: problems and prospects.
Research Report 75-5, Psychometric Methods Program, Department of
Psychology, University of Minnesota, Minneapolis, 1975.

Owen, R.J. A Bayesian approach to tailored testing. Research Bulletin
69-52. Princeton, N.J.: Educational Testing Service, 1969.

Owen, R.J. A Bayesian sequential procedure for quantal response in the
context of adaptive mental testing. Journal of the American
Statistical Association, 1975, 70, 351-356.

Samejima, F. A comment on Birnbaum's three-parameter logistic model in
the latent trait theory. Psychometrika, 1973, 38, 221-233.

Samejima, F. Behavior of the maximum likelihood estimate in a simulated
tailored testing situation. Paper presented at the meeting of the
Psychometric Society, Iowa City, April, 1975.

Urry, V.W. Individualized testing by Bayesian estimation. Research Bulletin
0171-177. Seattle: Bureau of Testing, University of Washington, 1971.

Urry, V.W. Computer-assisted testing: the calibration and evaluation
of the verbal ability bank. Technical Study 74-3. Washington, D.C.:
U.S. Civil Service Commission, Personnel Research and Development
Center, December 1974.

Vale, C.D. & Weiss, D.J. A simulation study of stradaptive ability testing.
Research Report 75-6, Psychometric Methods Program, Department of
Psychology, University of Minnesota, Minneapolis, 1975.

Weiss, D.J. The stratified adaptive computerized ability test. Research
Report 73-3, Psychometric Methods Program, Department of Psychology,
University of Minnesota, Minneapolis, 1973.

Weiss, D.J. Strategies of adaptive ability measurement. Research Report
74-5, Psychometric Methods Program, Department of Psychology,
University of Minnesota, Minneapolis, 1974.

Wood, R. Computerized adaptive sequential testing. Unpublished doctoral
dissertation, University of Chicago, 1971.






APPENDIX:
Supplementary Tables 









Table A-1

[Table contents not recoverable from the scanned source.]



Table A-2

Estimated Value of the Derivative, Conditional Standard
Deviation, and Value of the Information Function I(θ)
for Three Item Pool Configurations, at Each of 31 Trait Levels (θ)

                            Item Pool Configuration

           r_ab ≈ +.71            r_ab ≈ 0               r_ab-
   θ    Slope    SD   I(θ)    Slope    SD   I(θ)    Slope    SD   I(θ)

 -3.0   .523  .307   2.90   .588  .336   2.58   .450  .353   1.63
 -2.8   .566  .353   2.57   .629  .333   3.57   .511  .308   2.75
 -2.6   .607  .328   3.42   .668  .304   4.83   .568  .279   4.14
 -2.4   .645  .341   3.58   .704  .283   6.20   .621  .264   5.54
 -2.2   .682  .321   4.51   .738  .294   6.31   .670  .268   6.26
 -2.0   .716  .330   4.71   .770  .284   7.35   .716  .289   6.14
 -1.8   .748  .324   5.33   .799  .228  12.29   .758  .289   6.87
 -1.6   .778  .257   6.26   .826  .266   9.64   .796  .247  10.37
 -1.4   .783  .311   6.34   .850  .265  10.29   .830  .230  13.01
 -1.2   .832  .314   7.01   .872  .261  11.16   .860  .251  11.73
 -1.0   .855  .278   9.46   .892  .275  10.52   .886  .235  14.21
  -.8   .876  .316   7.69   .909  .278  10.70   .908  .244  13.86
  -.6   .895  .283  10.00   .924  .260  12.63   .927  .244  14.44
  -.4   .912  .282  10.47   .936  .288  10.57   .942  .255  14.66
  -.2   .927  .308   9.06   .946  .278  11.59   .953  .284  11.96
   0    .940  .305   9.50   .954  .249  14.68   .960  .257  13.96
   .2   .946  .253  13.98   .959  .248  14.96   .963  .284  11.50
   .4   .959  .255  14.14   .962  .281  11.72   .963  .252  14.59
   .6   .965  .287  11.29   .962  .275  12.25   .958  .285  11.31
   .8   .965  .269  12.85   .960  .248  15.00   .950  .276  11.85
  1.0   .971  .228  18.15   .956  .250  14.62   .938  .336   7.79
  1.2   .971  .228  18.13   .949  .250  14.42   .922  .294   9.84
  1.4   .968  .218  19.71   .940  .272  11.94   .902  .295   9.36
  1.6   .964  .246  15.35   .928  .259  12.85   .879  .301   8.52
  1.8   .957  .229  17.46   .914  .292   9.81   .851  .317   7.21
  2.0   .948  .263  13.00   .898  .289   9.66   .820  .296   7.67
  2.2   .937  .230  16.56   .879  .260  11.43   .785  .321   5.98
  2.4   .924  .210  19.35   .858  .255  11.32   .746  .294   6.44
  2.6   .908  .227  16.00   .834  .270   9.55   .703  .349   4.06
  2.8   .891  .258  16.69   .808  .250  10.46   .657  .332   3.91
  3.0   .871  .218  16.00   .780  .279   7.82   .606  .293   4.28






OISTRTBOriON LIST 



4 Or. Harshtll J, Farr* Ofrector 

Personnel and Training Research Programs 
Office of Kava] Research (Code 456) 
Arlington. VA ZZZITS^ 

1 ONR Branch Office 
4gS Sumner Street 
Boston* W 02210 
ATTN; Or. Jarws Lester 

1 m Branch Office 

1030 East Green Street v 
Pasadena* CA gilOl 
ATTN; Or. Eugene Gloye 

1 ONR Branch Office 
536 South Clark Street 
Chicago, IL 60605 
ATTN: Dr. Charles E. Oavis 

1 Or. H. A. Bertln* Scientific Olrector 
Office of Htval Research 
Scientific Liaison Group/Tokyo 
Antertcan EMitssy 
APO San Francisco 96503 

1 Or. Wallace SinalVo 
c/o Office of Naval Research 
Code 450 

Arlington, VA 22217 

G 01 rector 

Kaval Research Laboratory 
Code 2627 

Washington* DC 20390 

1 Technical Director 
Nayy Personnel Research 
and Oevelopmnt Center 
San Otego* CA g2152 

1 Assistant for Research Liaison 
Bureau of Naval Personnel (Pers Or) 
Room 1416* Arlln^^on Annex 
Washlngtoh* DC 20370 

1 Asstsunt Deputy Chief of Naval 
Personnel fur Retention Analysis 
and coordination (Pers 12) 

Room 2403* Arlington Annex 

Washington* DC 20370 

1 LCOn Cherles J. Thelsen* Jr.. HSC* USN 
4024^ 

Naval Air DevelopMnt Center 
Warminster* PA 1B974 

1 Dr. Let Htller 

Naval Air Systems Comnand 
AIR-413E 

Washington* DC 20361 

1 COR Paul 0. Nelsoni HSC* USN 

Htval Hedlcal RtO Cotrnmnd (Code 44) 
National Htval Hedkal Center 
Dethesdai 140 20014 



Coomindlng Officer 
Naval Health Research Center 
San OlegOt CA g2152 
ATTK; Library 

ChalnMn 

E^havloral Science Department 
Naval COMiand t Management Division 
U.S. Naval Acadeny 
Annapolis. HD ZT402 

Chief of Naval education t Training 
Naval Air Statibn 
Pensacola* FL/ 3250B 
ATTHf CAPT Vuce Stont* USN 

Hr. Arnold / Rubinstein 
Hunun Resowces Progran Hanaoer 
Naval Hatrf-lal Coonand (0344) 
Room 104f* crystal Plaia #5 
Ifash1ng}6n. DC 20360 

Or. Ja£)c R. Borsting 
U« S/Naval Postgraduate School 
Depaftment of Operations Research 
Hon/lerey* CA g3940 

rector* Navy Occupational Tesk 
Analysis Program (NOTAP) 
iavy Personnel Program Support 
Activity 

ullding 1304* Boiling AFB 
shiM^ton* DC 20336 

Offtfce of civilian Manpower Hanagement 
Codjv64 

Washfhgton* DC 20390 
AHNAor. Richtrd J. Niehftus 

1 Office it civilian Manpower Management 
code 263^ 

Washington DC 20390 



assistant Deputy Chief 
IS (Manpower] 
ifflce 




Mr. George N. Gralne 
Naval Sea Systems Cofnmand 
SEA 047C12 

Htshtngton, DC 20362 



40 




1 Chief of Ktval T^tmlcal Training 
Naval Air Station HMphls (75) 
HlllingtOA. TH 380$4 
ATTHs Or HonMn 0. Kerr 

1 Principal Civilian I'^dvlsor 
for Education and Training 
Naval Training Comnnd* Code DM 
Pensacola. FL 32508 
ATTN; Dr. Ullllam L Ha1oy\ 

1 Director 

Training Analysis I Evaluation Group 
Codt H-OOt 

Deoartnent of tht Navy 
DriandOt Fl 32S13 
ATTN: q^r. Alfrtd F. Smode 

1 Chief of Haval Training Support 
Code H-21 
Building 45 \ 
Naval Air Station 
Pensacola* Fl \3250fi 



1 



er 



Naval Undersea Ce 

Code 6032 

San Diego* CA 921 
ATTN: Gary Thomsd 

1 LCDR C. F. Logan* USN 
F-14 Hanageoent System 
COMF1 TAEWINGPAC 
NAS Hlrar^^r* CA 921 45 

1 Navy Personnel Research 

and Development Center 
Code D1 

San Diego* CA 92152 

5 Navy Personnel Research 
and Development Centei? 
Code D2 / 
San Diego* CA 92152 / 
ATTN: A. A. SJoholm / , 

2 Navy Personnel Resea^h / 

and Oevelopment Center 
Code 304 / 
San Diego* CA 92i1S2 
ATTN: Or. John Fdfrd 

2 Nayy Personnel Research 
and Developmei^ Center 
Codt 3]D 

San Diego. CA / 92152 
ATTN: Dr, Birtln F. Wiskoff 

2 Knvjr F«r«onn^ 1U9«Arch 
Qnd Dev«lo^Mnt Centtr 
Code 3D9 

San Dle|o,/CA 92152 
ATTSi Dr. C. COrjr 

1 Nayy Personnel Research 
and Development Center 
San Diego* CA 9^152 
ATTN; Library 



Navy Personnel Research 
and Oevelopment Center 
Code 9041 

San Diego. CA 92152 
ATTN; Dr, 0. D. FleUher 

D- M. Gragq* CAPT* HC/tlSN 
Head* Educational Pi^^rams Oevelopment 

Department 
Naval Health Sdf^es Education and 

Training C 
Bethesda, HO /20014 



1 Technical Director 

U.S. Am Research Institute for the 

eehivloral anl Social Sciences 
130vH11son BOHievard 
ArMngton* VA 22209 

eadquarters 

J.S. Armjf Administration Center 
Personnel Administration Combat 

Oevelopment Activity 
ATCP-MQ 

Ft. atnjanln Harrison* IN 46249 

1\ Armed Forces SUff College 
\Norfo1k* VA 23511 
^TTN: Library 



1 HQ USAREUR i 7th Ar^y 
ODCSOFS 

USAREUR Director of GEO 
APO New York D94D3 

1 ARI Field Unit • Leavenworth 
Post Office aox 3122 
Fort Leavenworth* KS 66027 

1 Hr. James 6aker 

U.S. Anny Research Institute for the 

Behavioral and Social Sciences 
1300 Wilson Boulevard 
Arlington* VA 22209 



1 Dr. James L. ftaney 

UpSp Artny Research Institute for the 

Behavioral and Social Sciences 
1300 Wilson Boulevard 
Arlington* VA 22209 

1 Dr. Hilton S. (Catz* Chief 

Individual Training i Performance 

Evaluation 
U.S. krmy Research Institute for the 

Behavioral and Social Sciences 
1300 Wilson Boulevard 
Arlington, VA 22209 

Air Force 



1 



indent 

U\s. Anv infantry School 
F{Art Penning* GA 31905 
ATW: AT5H-DET 



Or. Ralph Asek 

U.S. Anny Reuarch Institute for the 

Behavioral m Social Sciences 
1300 Wilson Bdh]«vard 
Mngton* VA \22D9 

. Dr\Leon H. Nawroci,, 

U.SXAmty Research Institute for the 

Bewloral and Social Sciences 
1300 Wmon Boulevard 
ArllngtoKVA 22209 

1 Dr« Joseph Ifard 

U.S. Ani\y Research Institute for the 

Sehavloral and Social Sciences 
1300 Wilson Soulevard 
Arlington* VA 22209 



1 



1 



Oepdty CoMMndcr 

U.sAAmv institute of Administration 
Fort Benjamin Harrison* IN 46216 
ATTNA EA 1 

Dr. Stinley L. Cohen 

U.S. Arity Research Institute for the 

Sehavftral and Social Sciences 1 
1300 Wnipn ftoulevard 
ArllngtonVVA 22209 



Research Branch 
AF/OPHYAR 

Randolph AF^^TX 7B148 

Dr. G. A. Eckstrand (AFHRL/AST) 
tfrlght-PattersOT AFB 
Ohio 45433 



AFHRl/OOJN 
Stop 163 

Lackland AFB* TX 



70236 



1 



Dr. Hartln RockMay (AFHRL/TT) 
Lowry AFB 
Colorado 90230 

Dr. Alfred R. Fregly 
AFOSR/NL 

1400 Wilson Boulevard 
Arlington* VA 22209 



AFHRL/PED 
Stop 163 

lackland AFB* TX 



7B236 



1 Major Wayne 5. Sellman 
Chief of Personnel Testing 
HQ USAF/OPKYP 
Randolph AFB* TX 7B14B 



41 



Marine Corpi 

1 Ofrector> Offfcc of NAnpOMtr 

HddqiMrtersi NArlne Corps (Cod« m) 
HC9 (Buflding 2009) 
Qiiantfco, VA 22m 

1 Or* A. L. Simosky 

Scfentlffc AtfvUor (Code RO-1) 
Headquarters* U.S. Narlnt Corps 
V«shfn9ton* DC 20360 

1 Chief, AcAdtfflk Oep«rtment 
CducAtloft CenUr 
Hftrinc Corps Dtvtlopwnt and 

CducAtfon CoHMftd 
HArtna Corps Base 
Quantlco, VA 22134 

1 Nr* C* A. Dover 

2711 South VeUch Street 
Arlington* VA 22206 

Coast Guard

1 Mr. Joseph J. Cowan, Chief
Psychological Research Branch (G-P-1/62)
U.S. Coast Guard Headquarters
Washington, DC 20590

Other Government

1 Dr. Lorraine D. Eyde
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

Dr. William Gorham, Director
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415



Other DOD

1 Military Assistant for Human Resources
Office of the Secretary of Defense
Room 3D129, Pentagon
Washington, DC 20301

1 Dr. Harold O'Neil, Jr.
Advanced Research Projects Agency
Human Resources Research Office
1400 Wilson Boulevard
Arlington, VA 22209

1 Dr. Robert Young
Advanced Research Projects Agency
Human Resources Research Office
1400 Wilson Boulevard
Arlington, VA 22209

1 Mr. Frederick W. Suffa
Chief, Recruiting and Retention Evaluation
Office of the Assistant Secretary of
Defense (M&RA)
Room 3D970, Pentagon
Washington, DC 20301

12 Defense Documentation Center
Cameron Station, Building 5
Alexandria, VA 22314
ATTN: TC



Dr. Vern
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC 20415

Dr. Harold T. Yahr
Personnel Research and Development
Center
U.S. Civil Service Commission
1900 E Street, N.W.
Washington, DC

Dr. Richard C. Atkinson
Deputy Director
National Science Foundation
1800 G Street, N.W.
Washington, DC 20550

Dr. Andrew R. Molnar
Technological Innovations in
Education Group
National Science Foundation
1800 G Street, N.W.
Washington, DC 20550

U.S. Civil Service Commission
Federal Office Building
Chicago Regional Staff Division
Regional Psychologist
230 South Dearborn Street
Chicago, IL 60604
ATTN: C. S. Winiewicz

Dr. Carl Frederiksen
Learning Division, Basic Skills Group
National Institute of Education
1200 19th Street, N.W.
Washington, DC 20208



Miscellaneous

Dr. Scarvia B. Anderson
Educational Testing Service
17 Executive Park Drive, N.E.
Atlanta, GA 30329

Dr. John Annett
Department of Psychology
The University of Warwick
Coventry CV4 7AL
ENGLAND

Dr. Gerald V. Barrett
University of Akron
Department of Psychology
Akron, OH 44325

Dr. Bernard M. Bass
University of Rochester
Graduate School of Management
Rochester, NY 14627

Century Research Corporation
4113 Lee Highway
Arlington, VA 22207

1 Dr. Kenneth E. Clark
University of Rochester
College of Arts and Sciences
River Campus Station
Rochester, NY 14627

1 Dr. Norman Cliff
University of Southern California
Department of Psychology
University Park
Los Angeles, CA 90007

1 Dr. Allan M. Collins
Bolt Beranek and Newman, Inc.
50 Moulton Street
Cambridge, MA 02138

1 Dr. Rene V. Dawis
University of Minnesota
Department of Psychology
Minneapolis, MN 55455

Dr. Ruth Day
Yale University
Department of Psychology
2 Hillhouse Avenue
New Haven, CT 06520



1 Dr. Norman R. Dixon
200 South Craig Street
University of Pittsburgh
Pittsburgh, PA 15260

1 Dr. Marvin D. Dunnette
University of Minnesota
Department of Psychology
Minneapolis, MN 55455

1 ERIC
Processing and Reference Facility
4833 Rugby Avenue
Bethesda, MD 20014






1 Dr. Victor Fields
Montgomery College
Department of Psychology
Rockville, MD 20850

1 Dr. Edwin A. Fleishman
Visiting Professor
University of California
Graduate School of Administration
Irvine, CA 92664

1 Dr. Robert Glaser, Co-Director
University of Pittsburgh
3939 O'Hara Street
Pittsburgh, PA 15213

1 Mr. Harry H. Harman
Educational Testing Service
Princeton, NJ 08540

1 Dr. Richard S. Hatch
Decision Systems Associates, Inc.
5640 Nicholson Lane
Rockville, MD 20852

1 Dr. M. D. Havron
Human Sciences Research, Inc.
7710 Old Spring House Road
West Gate Industrial Park
McLean, VA 22101

1 HumRRO Central Division
400 Plaza Building
Pace Boulevard at Fairfield Drive
Pensacola, FL 32505

1 HumRRO/Western Division
27857 Berwick Drive
Carmel, CA 93921
ATTN: Library

1 HumRRO Central Division/Columbus Office
Suite 23, 2501 Cross Country Drive
Columbus, GA 31906

1 HumRRO
Joseph A. Austin Building
1939 Goldsmith Lane
Louisville, KY 40218

1 Dr. Lawrence B. Johnson
Lawrence Johnson & Associates, Inc.
2001 S Street, N.W.
Washington, DC

1 Dr. Frederick Lord
Educational Testing Service
Princeton, NJ 08540

203 Dodd Hall
Florida State University



Dr. David Klahr
Carnegie-Mellon University
Department of Psychology
Pittsburgh, PA 15213

Dr. Robert R. Mackie
Human Factors Research, Inc.
6780 Corton Drive
Santa Barbara Research Park
Goleta, CA 93017



1 Dr. William C.
University of Southern California
Information Sciences Institute
4676 Admiralty Way
Marina Del Rey, CA

1 Mr. Edmond Marks
315 Old Main
Pennsylvania State University
University Park, PA 16802

1 Dr. Leo Munday, Vice President
American College Testing Program
P.O. Box 168
Iowa City, IA 52240

1 Dr. Donald A.
University of California, San Diego
Department of Psychology
La Jolla, CA 92037

1 Mr. A. J. Pesch, President
Eclectech Associates, Inc.
P.O. Box 171
North Stonington, CT 06359

Dr. Diane M. Ramsey-Klee
R-K Research & System Design
3947 Ridgemont Drive
Malibu, CA 90265

Dr. Joseph W. Rigney
University of Southern California
Behavioral Technology Laboratories
3717 South Grand
Los Angeles, CA 90007

Dr. George E. Rowland
Rowland and Company, Inc.
P.O. Box 61
Haddonfield, NJ 08033

Dr. Benjamin Schneider
University of Maryland
Department of Psychology
College Park, MD 20742

Dr. Arthur I. Siegel
Applied Psychological Services
404 East Lancaster Avenue
Wayne, PA 19087

1 Dr. Henry P. Sims, Jr.
Room 130 - Business
Indiana University
Bloomington, IN 47401



1 Dr. Richard Snow
Stanford University
School of Education
Stanford, CA 94305

Mr. George Wheaton
American Institutes for Research
3301 New Mexico Avenue, N.W.
Washington, DC 20016

Dr. K. Wescourt
Stanford University
Institute for Mathematical Studies
in the Social Sciences
Stanford, CA 94305

Richard T. Mowday
College of Business Administration
University of Nebraska, Lincoln
Lincoln, NE 68508

Dr. John J. Collins
Vice President
Essex Corporation
6305 Caminito Estrellado
San Diego, CA 92120

Dr. Lyle Schoenfeldt
Department of Psychology
University of Georgia
Athens, Georgia 30602

Dr. Patrick Suppes, Director
Institute for Mathematical Studies
in the Social Sciences
Stanford University
Stanford, CA 94305






Previous Reports in this Series



73-1. Weiss, D.J. & Betz, N.E. Ability Measurement: Conventional or Adaptive?
      February 1973. (AD 757788).

73-2. Bejar, I.I. & Weiss, D.J. Comparison of Four Empirical Differential
      Item Scoring Procedures. August 1973.

73-3. Weiss, D.J. The Stratified Adaptive Computerized Ability Test.
      September 1973. (AD 768376).

73-4. Betz, N.E. & Weiss, D.J. An Empirical Study of Computer-Administered
      Two-Stage Ability Testing. October 1973. (AD 768993).

74-1. DeWitt, L.J. & Weiss, D.J. A Computer Software System for Adaptive
      Ability Measurement. January 1974. (AD 773961).

74-2. McBride, J.R. & Weiss, D.J. A Word Knowledge Item Pool for Adaptive
      Ability Measurement. June 1974. (AD 781894).

74-3. Larkin, K.C. & Weiss, D.J. An Empirical Investigation of Computer-
      Administered Pyramidal Ability Testing. July 1974. (AD 783553).

74-4. Betz, N.E. & Weiss, D.J. Simulation Studies of Two-Stage Ability
      Testing. October 1974. (AD A001230).

74-5. Weiss, D.J. Strategies of Adaptive Ability Measurement. December
      1974. (AD A004270).

75-1. Larkin, K.C. & Weiss, D.J. An Empirical Comparison of Two-Stage and
      Pyramidal Adaptive Ability Testing. February 1975. (AD A006733).

75-2. McBride, J.R. & Weiss, D.J. TETREST: A FORTRAN IV Program for
      Calculating Tetrachoric Correlations. March 1975. (AD A007572).

75-3. Betz, N.E. & Weiss, D.J. Empirical and Simulation Studies of Flexilevel
      Ability Testing. July 1975. (AD A013185).

75-4. Vale, C.D. & Weiss, D.J. A Study of Computer-Administered Stradaptive
      Ability Testing. October 1975. (AD A018758).

75-5. Weiss, D.J. (Ed.). Computerized Adaptive Trait Measurement: Problems
      and Prospects. November 1975. (AD A018675).

75-6. Vale, C.D. & Weiss, D.J. A Simulation Study of Stradaptive Ability Testing.
      December 1975. (AD A020961).

AD Numbers are those assigned by the Defense Documentation Center, 
for retrieval through the National Technical Information Service. 






Copies of these reports are available, while supplies last, from: 

Psychometric Methods Program 
Department of Psychology 
University of Minnesota 
Minneapolis, Minnesota 55455






